An RFM Model Customizable to Product Catalogues and Marketing Criteria Using Fuzzy Linguistic Models: Case Study of a Retail Business

In the field of strategic marketing, the recency, frequency and monetary (RFM) variables model has been applied for years to determine how solid a database is in terms of spending and customer activity. Retailers almost never obtain data related to their customers beyond their purchase history, and if they do, the information is often out of date. This work presents a new method, based on the fuzzy linguistic 2-tuple model and the definition of product hierarchies, which provides a linguistic interpretability giving business meaning and improving the precision of conventional models. The fuzzy linguistic 2-tuple RFM model, adapted by the product hierarchy thanks to the analytical hierarchical process (AHP), is revealed to be a useful tool for including business criteria, product catalogues and customer insights in the definition of commercial strategies. The result of our method is a complete customer segmentation that enriches the clusters obtained with the traditional fuzzy linguistic 2-tuple RFM model and offers a clear view of customers' preferences and possible actions to define cross- and up-selling strategies. A real case study based on a worldwide leader in home decoration was developed to guide, step by step, other researchers and marketers. The model was built using the only information that retailers always have: customers' purchase ticket details.


Introduction
We live in a fast-changing digital world. In today's age, customers expect sellers to talk directly with them and offer the perfect product, with the right message and at the correct time. Big Data analytics have an immense potential to empower customer experience management, as they can help organizations to achieve a better and faster understanding of the customer journey and make decisions to improve the customer experience (Wedel and Kannan [1]).
Many organizations are still learning how to capture data from the multitude of available touchpoints, devices, media and applications (Maechler et al. [2]). In some cases, even when they have the data, organizations still face difficulties in understanding and managing those data and generating relevant insights (Said et al. [3]). Information about customers is obtained through the use of analytics (Wedel and Kannan [1]), and although its use is becoming more common, many companies still rely on basic, poor analytics to extract information about their customers (Moorman [4]; Ramsbotham et al. [5]).
Digital transformation has been a revolution in the way companies manage their business and also in the way they manage relationships with customers, employees, suppliers, and other stakeholders (Bresciani et al. [6]; Scuotto et al. [7]).

Materials and Methods
In this section, we summarize the literature review, including a timeline with important improvements for the RFM model; we also include the theoretical contents and previous works that we consider essential to be able to follow our proposal.

Literature Review
The RFM model is a very well-known technique that is defined by three measures (recency, frequency and monetary), each of which is normally divided into five quintiles (groups of 20% of customers) and combined into a three-digit RFM cell code (Bult and Wansbeek [20]; Bitran and Mondschein [29]; Miglautsch [30]; Chang et al. [31]; Miglautsch [32]). According to our experience and prior findings, RFM values can be firm-specific and are based on the nature of the products and customers' behaviour (Lumsden et al. [33]). Many authors have attempted to improve the original RFM model. Wei et al. [34] prepared a summary of these improvements covering the different versions of RFM until 2010, and Ernawati et al. [35] continued that work and summarized new versions and improvements to the RFM model from 2015 to 2021. One of the proven improvements of the traditional RFM model was the fuzzy linguistic 2-tuple RFM model (Carrasco et al. [36,37]; Martínez et al. [38]).
As we already mentioned, CLV is the value a customer contributes to a business over their entire lifetime at the company. By making use of CLV, companies tend to place emphasis on long-term customer satisfaction and loyalty instead of maximizing short-term relations and sales (Gupta et al. [22]; Kumar [39]; Fader et al. [40]). There are many publications related to other models that attempt to calculate customers' behaviour (Bolton [41]; Baesens et al. [42]; Malthouse [43]; Berry and Linoff [44]; Malthouse and Blattberg [45]; Rud [46]; Zhang et al. [47]), but CLV estimation based on recency, frequency and monetary (RFM) values remains the most used.
Mathematics 2021, 9, 1836
Our proposal is an extension of the fuzzy linguistic 2-tuple RFM model, and it is applied to both historical and current products purchased by customers, to calculate the customer value based on the RFMScore. Wong and Wei [48] calculated the weighted RFMScore. The weight of each RFM variable in the RFMScore depends on the importance of that factor in the application (Dursun and Caber [49]; Peker et al. [50]); some authors applied the same weight to each attribute (Peker et al. [50]; Hamdi and Zamiri [51]; Weng [52]), but other researchers applied the analytical hierarchical process (AHP) to define the corresponding weights, such as Moghaddam et al. [25], He and Li [53], Rezaeinia and Rahmani [54], Marisa et al. [55], Patel et al. [56], Hosseini and Mohammadzadeh [57], Dachyar et al. [58] and Monalisa et al. [59]. We take advantage of their findings and use AHP to define the weights of each RFMScore per product category, obtaining a more complete approach to customer preferences and customer value.
Taking advantage of customers' purchase behaviour, we define a new hierarchy of products that responds better to customers' needs, not only to business needs, and helps retailers to better determine their customers' preferences. We applied PCA to discover patterns in the original product hierarchies and aggregated a huge number of product dimensions into a more manageable number that fitted customers' preferences (Maćkiewicz and Ratajczak [60]; Karamizadeh et al. [61]; Abdi and Williams [62]; Paul et al. [63]; Bryant and Yarnold [64]). Figure 1 shows, in a timeline, a summary of the main RFM model improvements. The publications that have a direct relation with our work are highlighted in bold.
It can be seen that Hughes [19] first defined the RFM model in 1994; Bult and Wansbeek [20] first introduced the use of the RFMScores; Suh et al. [65] combined the RFM model with data mining algorithms; Miglautsch [30] used the RFMScores to perform the first customer segmentations; Kaymak [66] introduced the concept of fuzzy RFM by segmenting with the fuzzy c-means algorithm; and Hsieh [67] was the first to modify the variables R, F and M to adapt the model to a particular business. In the same year, Tsai and Chiu [68] introduced the concept of weighted RFM. In 2005, Buckinx and Van den Poel [69] introduced the length dimension to the model; Fader et al. [40] described enriching it with the CLV; and Liu and Shih applied the AHP to calculate the weights of the variables R, F and M to define CLV and applied the results to customer segmentation. They also calculated association rules for the construction of a collaborative recommender system.
In 2008, Yeh et al. [21] added the variable of time. In 2009, Coussement and Van den Poel [70] introduced emotions into the model, and Chen et al. [71] enriched the model with the Apriori algorithm. In 2010, Hosseini et al. [72] applied the model to a B2B business and introduced a period variable for client activity; Li et al. [73] introduced pointwise mutual information; and Sekhavat et al. [74] added the duration variable. In 2014, significant improvements were made by Albadvi et al. [75], who applied fuzzy WRFM with a Pareto/NBD distribution to segment customers and estimate the future CLV. In 2015, Carrasco et al. [36] introduced the linguistic 2-tuple RFM model; Güçdemir and Selim [76] applied the AHP model in an interesting way to weight the customer segments they obtained; and Zhang et al. [47] enriched the model with clumpiness.
In 2016, Dursun and Caber [49] included the dimension of seasonality in their segmentation; He and Li [53] enriched the model with users' satisfaction with e-commerce websites; Hosseini and Mohammadzadeh [57] included length in the model; and Song et al. [77] introduced an interesting element, time, as a dynamic dimension. In 2017, a contribution of particular interest for our research appeared: Moghaddam et al. [25] introduced product information, although only through the variable V, related to the variety of products. Peker et al. [50] added length and periodicity. In 2018, Li et al. [78] applied k-means to segment clients with a model enriched with length and membership duration. In 2019, Heldt et al. [24] were the first to describe the product directly; they estimated the future CLV for each product. Martínez et al. [38] demonstrated the improvement in customer segmentation results attributable to the linguistic 2-tuple model. In 2021, new models appeared, such as the PRFM of Hajmohamad et al. [79], which works on the profit margin, and that of Hwan and Lee [80], which applies the TextRank algorithm to improve the RFM by including website-specific weights and is thus able to work with clients without a purchase history. Chen and Huang [81] introduced the discretization of variables as an improvement to the model, and Bueno et al. [82] improved it by introducing opinion aggregations.

The 2-Tuple Fuzzy Linguistic Model
The fuzzy linguistic 2-tuple approach (Herrera and Martínez [83]) is a continuous model of information representation (Herrera and Herrera [84]) that has been used in many business and management applications. This model carries out processes of "computing with words" without the loss of information that is typical of other fuzzy linguistic approaches. Below, we present the basic notation and operations needed to explain our proposal.
Let $S = \{s_0, \ldots, s_T\}$ be a linguistic term set with odd cardinality, where the middle term represents the neutral value and the remaining terms are symmetric with respect to it. We assume that the semantics of the labels are given by triangular membership functions $\mu_{s_i}: [0, 1] \to [0, 1]$ and that all terms are distributed on a scale on which a total order is defined, i.e., $s_i \le s_j \Leftrightarrow i \le j$. Each membership function is represented by a 3-tuple $(i, j, k)$, where $j$ is the point at which the membership is 1, and $i$ and $k$ are the left and right limits of the domain of the triangular membership function, respectively. Figure 2 represents the semantics assigned to five terms via triangular membership functions, where VB = very bad = (0, 0, 0.25); B = bad = (0, 0.25, 0.5); N = neutral = (0.25, 0.5, 0.75); G = good = (0.5, 0.75, 1); and VG = very good = (0.75, 1, 1).

In this fuzzy linguistic context, if a symbolic method aggregating linguistic information (Herrera and Herrera [84]) obtains a value $b \in [0, T]$ with $b \notin \{0, \ldots, T\}$, then an approximation function is used to express the result in $S$.

Definition 1. (Herrera and Martínez [83]) Let $b$ be the result of an aggregation of the indexes of a set of labels assessed in a linguistic term set $S$, i.e., the result of a symbolic aggregation operation, $b \in [0, T]$. Let $i = round(b)$ and $\alpha = b - i$ be two values, such that $i \in [0, T]$ and $\alpha \in [-0.5, 0.5)$; then $\alpha$ is called a symbolic translation.

The fuzzy linguistic 2-tuple approach is developed from the concept of symbolic translation by representing the linguistic information by means of 2-tuples $(s_i, \alpha_i)$, $s_i \in S$ and $\alpha_i \in [-0.5, 0.5)$, where $s_i$ represents the linguistic label of the information and $\alpha_i$ is a numerical value expressing the translation from the original result $b$ to the closest index label, $i$, in the linguistic term set $S$. The value $(s_i, \alpha_i)$ can also be written as $s_i \pm \alpha_i$ (+ or −, depending on the sign of $\alpha_i$).

This model defines a set of transformation functions between numeric values and 2-tuples:

Definition 2. (Herrera and Martínez [83]) Let $S = \{s_0, \ldots, s_T\}$ be a linguistic term set and $b \in [0, T]$ a value representing the result of a symbolic aggregation operation; then the 2-tuple that expresses the equivalent information to $b$ is obtained with the following function:

$$\Delta: [0, T] \to S \times [-0.5, 0.5), \qquad \Delta(b) = (s_i, \alpha), \quad \text{with } i = round(b), \; \alpha = b - i, \qquad (1)$$

where $round(\cdot)$ is the usual round operation, $s_i$ has the closest index label to $b$ and $\alpha$ is the value of the symbolic translation.

For all $\Delta$, there exists $\Delta^{-1}$, defined as follows:

$$\Delta^{-1}(s_i, \alpha) = i + \alpha. \qquad (2)$$

The negation operator is defined as follows:

$$neg((s_i, \alpha)) = \Delta(T - \Delta^{-1}(s_i, \alpha)). \qquad (3)$$
Information aggregation consists of obtaining a value that summarizes a set of values. Hence, the result of the aggregation of a set of 2-tuples must be a 2-tuple. Using the functions $\Delta$ and $\Delta^{-1}$, which transform numerical values into linguistic 2-tuples and vice versa without loss of information, any of the existing aggregation operators can be easily extended to deal with linguistic 2-tuples.
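The transformation functions of this section can be illustrated with a minimal Python sketch. It assumes the five-label set of Figure 2 (so T = 4), represents a 2-tuple as a pair (label index, symbolic translation), and uses a weighted arithmetic mean as one possible aggregation operator extended through the two transformation functions. The names `delta`, `delta_inv`, `neg`, `weighted_mean` and `to_label` are illustrative choices, not identifiers from the original work.

```python
import math
from typing import Sequence, Tuple

LABELS = ["VB", "B", "N", "G", "VG"]  # s_0 .. s_T, as in Figure 2
T = len(LABELS) - 1                    # T = 4 for five labels

TwoTuple = Tuple[int, float]           # (index i of s_i, symbolic translation alpha)


def delta(b: float) -> TwoTuple:
    """Map b in [0, T] to the 2-tuple (s_i, alpha), alpha in [-0.5, 0.5)."""
    if not 0 <= b <= T:
        raise ValueError(f"b must lie in [0, {T}], got {b}")
    # floor(b + 0.5) rounds halves up, which keeps alpha in [-0.5, 0.5).
    # Python's built-in round() rounds halves to even and would not.
    i = min(T, math.floor(b + 0.5))
    return i, b - i


def delta_inv(t: TwoTuple) -> float:
    """Map a 2-tuple back to its numeric value i + alpha."""
    i, alpha = t
    return i + alpha


def neg(t: TwoTuple) -> TwoTuple:
    """Negation operator: mirror the 2-tuple around the middle of the scale."""
    return delta(T - delta_inv(t))


def weighted_mean(tuples: Sequence[TwoTuple], weights: Sequence[float]) -> TwoTuple:
    """Assumed aggregation operator: weighted mean computed on numeric values."""
    total = sum(weights)
    b = sum(delta_inv(t) * w for t, w in zip(tuples, weights)) / total
    return delta(min(max(b, 0.0), T))  # clamp to absorb floating-point drift


def to_label(t: TwoTuple) -> str:
    """Human-readable form, e.g. (G, -0.20)."""
    i, alpha = t
    return f"({LABELS[i]}, {alpha:+.2f})"


if __name__ == "__main__":
    score = delta(2.8)
    print(to_label(score), to_label(neg(score)))
```

Because the aggregation is carried out on the numeric values and only the final result is converted back, no information is lost to intermediate rounding.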

The Fuzzy Linguistic 2-Tuple RFM Model
The RFM model was first proposed by Hughes in 1994 [19]. It is a popular tool for customer value analysis and has been extensively used for measuring customer lifetime value (Cheng and Chen [86]) and in customer segmentation and behaviour analysis (Chen et al. [87]). The RFM analytic approach is a common model that identifies customer purchase behaviour and differentiates important customers in large data sets by three variables: recency (R), the time since the customer's last purchase; frequency (F), the number of purchases made by the customer; and monetary (M), the total amount of those purchases. The aim, therefore, is to categorize each customer by means of scores based on these three variables, typically using quintiles (5 represents the best 20% of customers in that variable, and 1 represents the worst 20%), from which a single score representing the customer's value is calculated. However, these scores are not very precise, so Carrasco et al. [37] and Martínez et al. [38] proposed an improvement to the RFM model, which consists of representing these scores using the fuzzy linguistic 2-tuple model. The stages of this proposal are explained below:

1. Data collection: let $U = \{u_1, \ldots, u_{\#U}\}$ be the set of customers who have made at least one purchase over a pre-established analysis period. Let $T = \{(u_1, d_1, a_1), \ldots, (u_{\#T}, d_{\#T}, a_{\#T})\}$ be the details of the transactions or purchases made by such customers in this period, where $u_i \in U$ identifies the customer who made the purchase on the date $d_i$ for the amount $a_i$.

2. Customer aggregation: in this phase, $T$ is aggregated at the customer level, obtaining the set $TU = \{(u_1, r_1, f_1, m_1), \ldots, (u_{\#U}, r_{\#U}, f_{\#U}, m_{\#U})\}$, where $r_e$ is the number of days since the last purchase of the customer $u_e$ (using a later fixed reference date for all customer purchases), $f_e$ is the number of times the customer has purchased and $m_e$ is the total amount of these purchases.
Therefore, the following variables are calculated: $R_e, F_e, M_e, RFM_e \in S \times [-0.5, 0.5)$. For each customer $u_e$, $e = 1, \ldots, \#U$, we obtain $A_e = (A_{e1}, A_{e2}, A_{e3})$ with $A_{e1} = R_e$, $A_{e2} = F_e$ and $A_{e3} = M_e$. First, customers are sorted in ascending order according to each of the individual components $B_e = (B_{e1}, B_{e2}, B_{e3})$, with $B_{e1} = r_e$, $B_{e2} = f_e$ and $B_{e3} = m_e$, contained in $TU$. Now, we define $rank_{ei} \in \{1, \ldots, \#U\}$ as the ranking of each customer with respect to each of these variables:

$$percent\_rank_{ei} = \frac{rank_{ei} - 1}{\#U - 1},$$

with $percent\_rank_{ei} \in [0, 1]$, $e = 1, \ldots, \#U$ and $i = 1, \ldots, 3$. The final 2-tuple score $A_{ei}$ is obtained as follows:

$$A_{ei} = \begin{cases} neg(\Delta(percent\_rank_{ei} \cdot T)) & \text{if } i = 1, \\ \Delta(percent\_rank_{ei} \cdot T) & \text{otherwise,} \end{cases}$$

where $\Delta(\cdot)$ and $neg(\cdot)$ are defined in Section 2.2.1 (Equations (1) and (3)). We use the negation function on recency so that the larger scores represent the most recent buyers. The 2-tuple $RFM_e$, which jointly characterizes the $R_e$, $F_e$ and $M_e$ scores, is calculated for each customer using Equation (5) as $RFM_e = \overline{A}^{w}[A_{ei}]$, with the weights $W = \{w_R, w_F, w_M\}$ previously defined by the marketing experts.
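The stages above can be sketched end to end in a few lines of Python. This is a minimal illustration under stated assumptions rather than the authors' implementation: the scale has five labels (T = 4), tied customers share the lowest rank, the percent rank is taken as (rank − 1)/(#U − 1), and the final aggregation is a weighted mean of the numeric values of the three 2-tuples. The function names and the sample transactions are hypothetical.

```python
import math
from bisect import bisect_left
from datetime import date
from typing import Dict, List, Sequence, Tuple

T = 4  # five linguistic labels s_0 .. s_4 (assumption)
TwoTuple = Tuple[int, float]
Transaction = Tuple[str, date, float]  # (customer, purchase date, amount)


def delta(b: float) -> TwoTuple:
    i = min(T, math.floor(b + 0.5))  # round half up keeps alpha in [-0.5, 0.5)
    return i, b - i


def delta_inv(t: TwoTuple) -> float:
    return t[0] + t[1]


def neg(t: TwoTuple) -> TwoTuple:
    return delta(T - delta_inv(t))


def percent_ranks(values: Dict[str, float]) -> Dict[str, float]:
    """(rank - 1) / (#U - 1) in ascending order; ties share the lowest rank."""
    n = len(values)
    if n == 1:
        return {u: 0.0 for u in values}  # assumption for a single customer
    ordered = sorted(values.values())
    return {u: bisect_left(ordered, v) / (n - 1) for u, v in values.items()}


def rfm_scores(transactions: List[Transaction], ref_date: date,
               weights: Sequence[float] = (1.0, 1.0, 1.0)) -> Dict[str, Dict[str, TwoTuple]]:
    # Customer aggregation: last purchase date, purchase count, total amount.
    last: Dict[str, date] = {}
    freq: Dict[str, float] = {}
    money: Dict[str, float] = {}
    for u, d, a in transactions:
        last[u] = max(last.get(u, d), d)
        freq[u] = freq.get(u, 0) + 1
        money[u] = money.get(u, 0.0) + a
    recency = {u: float((ref_date - d).days) for u, d in last.items()}

    pr_r, pr_f, pr_m = percent_ranks(recency), percent_ranks(freq), percent_ranks(money)
    scores = {}
    for u in recency:
        r = neg(delta(pr_r[u] * T))  # negated: recent buyers get high scores
        f = delta(pr_f[u] * T)
        m = delta(pr_m[u] * T)
        b = sum(w * delta_inv(s) for w, s in zip(weights, (r, f, m))) / sum(weights)
        scores[u] = {"R": r, "F": f, "M": m, "RFM": delta(min(max(b, 0.0), T))}
    return scores


if __name__ == "__main__":
    sample = [
        ("u1", date(2021, 1, 10), 100.0), ("u1", date(2021, 3, 1), 50.0),
        ("u2", date(2021, 2, 1), 20.0),
        ("u3", date(2021, 3, 20), 300.0), ("u3", date(2021, 3, 25), 10.0),
        ("u3", date(2021, 3, 28), 5.0),
    ]
    for customer, s in rfm_scores(sample, date(2021, 4, 1)).items():
        print(customer, s)
```

In this toy data set, the customer with the most recent, most frequent and largest purchases receives the top label on all three variables, while the least active customer receives the bottom label.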

Analytical Hierarchical Process (AHP)
This technique is a systematic and hierarchical method that helps the decision maker to solve complex multicriteria decision making (MCDM) problems, which involve ranking alternatives. The AHP has been widely used in the calculation of customer lifetime value to define the importance of the RFM variables (Liu and Shih [88]). To adopt the AHP method for the objective of this work, we follow the steps proposed by Saaty [89] and Carrasco et al. [37].

Structuring of the Decision Problem into a Hierarchical Model
This consists of decomposing the decision problem into elements according to their common characteristics and visually constructing a hierarchical model of interrelated criteria, which facilitates their understanding and evaluation. The first level always contains the goal of the problem; the second level comprises the criteria, which can be subdivided into sub-criteria; and the last level contains the alternatives. Thus, in this step, we define the set of alternatives $A = \{a_1, \ldots, a_{\#A}\}$ and the hierarchical criteria $C$ for assessing them. The first level of criteria is $C_1 = \{c_{11}, \ldots, c_{1\#C_1}\}$; each of these criteria $c_{1i}$ can, in turn, be subdivided into sub-criteria at several levels, $c_{1ij} = \{c_{1i1}, \ldots, c_{1\#C_{ij}}\}$, and so on recursively.

Making Pairwise Comparisons and Obtaining the Judgmental Matrix
In this step, the opinion of the decision makers is used to compare pairs of elements of a particular level with respect to a specific element at the level immediately above. Let $PW = (pw_{ij})_{n \times n}$ be a pairwise comparison matrix, where the element $pw_{ij}$ represents the importance of criterion $i$ over criterion $j$ as evaluated by the decision makers, who judge the relative importance of one criterion over another with respect to the goal. The relative importance of one sub-criterion over another with respect to the main dimension is calculated in the same way. Every judgement is expressed on the predefined rating scale shown in Table 1. Each entry $pw_{ij}$ of the judgemental matrix is governed by three rules: $pw_{ij} > 0$; $pw_{ij} = 1/pw_{ji}$ (reciprocal property); and $pw_{ii} = 1$ for all $i$.

Table 1. Rating scale (Saaty [89]).

| Intensity of Importance | Definition | Explanation |
|---|---|---|
| 1 | Equal importance | Two activities contribute equally to the objective |
| 2 | Weak or slight | |
| 3 | Moderate importance | Experience and judgement slightly favour one activity over another |
| 4 | Moderate plus | |
| 5 | Strong importance | Experience and judgement strongly favour one activity over another |
| 6 | Strong plus | |
| 7 | Very strong or demonstrated importance | An activity is favoured very strongly over another; its dominance is demonstrated in practice |
| 8 | Very, very strong | |
| 9 | Extreme importance | The evidence favouring one activity over another is of the highest possible order of affirmation |
| Reciprocals of above | If activity i has one of the above non-zero numbers assigned to it when compared with activity j, then j has the reciprocal value when compared with i | |
| 1.1-1.9 | If the activities are very close | It may be difficult to assign the best value, but when compared with other contrasting activities, the size of the small numbers would not be too noticeable; however, they can still indicate the relative importance of the activities |
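The three rules governing the judgemental matrix can be made concrete with a short Python sketch. It assumes that the decision makers supply only the judgements above the diagonal, on the scale of Table 1, and that the remaining entries follow from the reciprocal property. The function names and the example judgements for three criteria are hypothetical.

```python
from typing import Dict, List, Tuple


def build_pairwise_matrix(n: int, judgements: Dict[Tuple[int, int], float]) -> List[List[float]]:
    """Build an n x n reciprocal matrix from judgements pw[i][j] given for i < j."""
    pw = [[1.0] * n for _ in range(n)]  # the diagonal stays at 1
    for i in range(n):
        for j in range(i + 1, n):
            value = judgements[(i, j)]
            if value <= 0:
                raise ValueError("judgements must be strictly positive")
            pw[i][j] = value
            pw[j][i] = 1.0 / value  # reciprocal property
    return pw


def is_valid_judgement_matrix(pw: List[List[float]], tol: float = 1e-9) -> bool:
    """Check positivity, unit diagonal and reciprocity of a square matrix."""
    n = len(pw)
    for i in range(n):
        if len(pw[i]) != n or abs(pw[i][i] - 1.0) > tol:
            return False
        for j in range(n):
            if pw[i][j] <= 0 or abs(pw[i][j] * pw[j][i] - 1.0) > tol:
                return False
    return True


if __name__ == "__main__":
    # Hypothetical judgements: criterion 0 is moderately more important than
    # criterion 1 (3), strongly more important than criterion 2 (5), and
    # criterion 1 is slightly more important than criterion 2 (2).
    matrix = build_pairwise_matrix(3, {(0, 1): 3, (0, 2): 5, (1, 2): 2})
    for row in matrix:
        print([round(x, 3) for x in row])
    print(is_valid_judgement_matrix(matrix))
```

Asking only for the upper triangle means n(n − 1)/2 judgements are needed for n criteria, and the reciprocal rule cannot be violated by construction.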

Obtaining Local Weights and Consistency of Comparisons
The criteria weight vector, $w$, is built using the eigenvector method through the following equation:

$$PW \cdot w = \lambda_{max} \cdot w,$$

where $\lambda_{max}$ is the maximum eigenvalue of $PW$ and $w$ is the normalized eigenvector associated with it. This approach provides the priority weights for each criterion or sub-criterion. The consistency of the AHP can be checked by the consistency ratio (CR), which is defined as follows:

$$CR = \frac{CI}{RI},$$

that is, the ratio between the consistency index (CI), defined as $CI = (\lambda_{max} - n)/(n - 1)$, and the random consistency index (RI), which represents the consistency of a randomly generated pairwise comparison matrix. Table 2 shows the RI provided by Saaty. If CR ≤ 0.1, the judgements are considered sufficiently consistent; otherwise, the decision makers must revise the entries of the pairwise comparison matrix until their judgements are more consistent.
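As an illustration of this step, the following Python sketch approximates the principal eigenvector by power iteration, which is one common way of applying the eigenvector method, and then computes CI and CR. The random index values are the ones commonly quoted from Saaty for matrices of order 1 to 10 and are included here as an assumption; the function names and the example matrices are hypothetical.

```python
from typing import List, Tuple

# Commonly quoted random consistency index values by matrix order (assumption).
RANDOM_INDEX = {1: 0.0, 2: 0.0, 3: 0.58, 4: 0.90, 5: 1.12,
                6: 1.24, 7: 1.32, 8: 1.41, 9: 1.45, 10: 1.49}


def ahp_weights(pw: List[List[float]], iterations: int = 100) -> Tuple[List[float], float]:
    """Return (normalized principal eigenvector, estimate of lambda_max)."""
    n = len(pw)
    w = [1.0 / n] * n
    for _ in range(iterations):
        # Multiply PW by w, then renormalize so that the weights sum to 1.
        v = [sum(pw[i][j] * w[j] for j in range(n)) for i in range(n)]
        total = sum(v)
        w = [x / total for x in v]
    pw_w = [sum(pw[i][j] * w[j] for j in range(n)) for i in range(n)]
    lambda_max = sum(pw_w[i] / w[i] for i in range(n)) / n
    return w, lambda_max


def consistency_ratio(lambda_max: float, n: int) -> float:
    """CR = CI / RI with CI = (lambda_max - n) / (n - 1)."""
    if n <= 2:
        return 0.0  # matrices of order 1 or 2 are always consistent
    ci = (lambda_max - n) / (n - 1)
    return ci / RANDOM_INDEX[n]


if __name__ == "__main__":
    pw = [[1.0, 3.0, 5.0],
          [1 / 3, 1.0, 2.0],
          [1 / 5, 1 / 2, 1.0]]
    weights, lam = ahp_weights(pw)
    cr = consistency_ratio(lam, len(pw))
    print([round(x, 3) for x in weights], round(lam, 4), round(cr, 4))
    print("acceptable" if cr <= 0.1 else "revise judgements")
```

For a perfectly consistent matrix, where every entry equals the ratio of two underlying weights, the estimate of the maximum eigenvalue equals n and CR is zero.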

Developed Approach
This section explains the proposed model for defining the best commercial strategy by means of an RFM model that is customizable by the product catalogue and marketing criteria and is built on the fuzzy linguistic 2-tuple RFM model. The process consists of the four steps represented in Figure 3. In step 1, we prepare the product hierarchy so that product information can be introduced into the model. In step 2, we calculate the weights for the variables and the fuzzy linguistic 2-tuple RFMScore per product category. In step 3, we define customer segments based on the RFMScore by product category, and in the last step, we have all the tools needed to define the marketing strategy. The scheme shows how our model takes purchase and customer databases, historical and current products from a catalogue, business experts' opinions and social events as inputs to define the 2-tuple linguistic model and the RFMScore per product for customer segmentation, where product catalogue hierarchies and business criteria are applied to customize the results and adapt them to business needs.

Product Representation: Step 1
Let $P = \{p_1, \ldots, p_{\#P}\}$ be the set of a company's products that can currently be bought, i.e., those currently in use in the company, and let $HP = \{hp_1, \ldots, hp_{\#HP}\}$ be the set of the company's historical products, i.e., products that can no longer be bought because they have been withdrawn from the range.
A company's product portfolio is typically organized into a hierarchy. We therefore define $H = \bigcup_{k=1}^{\#H} L(k)$ as a product hierarchy, where each $L(k)$ implies a classification of the set $P$, and the higher the level $k$ is, the more detailed the classification is. Additionally, each $L(k)$, with $k > 1$, is subordinate to $L(k - 1)$.
With this hierarchy, we can represent the usual levels present in the product portfolio (Kotler [90]), as shown in Figure 4, where L(6) corresponds to the set P and L(7) corresponds to the set HP. In any retailer, the range of products is continually renewed; each season, products that are no longer manufactured are replaced by others of better quality or adapted to market trends, etc. As our system uses historical databases of customer purchases, all products that have appeared in the catalogue are needed, even if they are not currently in use, i.e., not included in the set P. For this reason, we include the last level, HP, which contains all products that have historically existed in the company. To manage this new level, each product of the previous level, P, must also be created in the HP level. In addition, each terminated product is related to a current product. In this way, we can use previous purchases of non-current products to profile current customers.
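One way to picture this representation is the small Python sketch below. It is only an illustration under simplifying assumptions: the hierarchy is shortened to three category levels above the product level, every current product also appears in the historical level mapped to itself, and every terminated product is mapped to the current product that replaced it. All product identifiers, category names and function names are hypothetical.

```python
from typing import Dict, Tuple

# Set P: each current product with its path through the category levels
# L(1), L(2), L(3) (a shortened hierarchy, assumed for illustration).
CURRENT_PRODUCTS: Dict[str, Tuple[str, ...]] = {
    "desk-2021": ("Furniture", "Home office", "Desks"),
    "candle-vanilla": ("Decoration", "Home fragrance", "Scented candles"),
}

# Set HP: every product that has ever existed, related to a current product.
# Current products map to themselves; terminated ones map to their successor.
HISTORICAL_TO_CURRENT: Dict[str, str] = {
    "desk-2021": "desk-2021",
    "candle-vanilla": "candle-vanilla",
    "desk-2018": "desk-2021",
    "candle-cinnamon-2019": "candle-vanilla",
}


def current_product(product_id: str) -> str:
    """Resolve any purchased product (current or historical) to the set P."""
    return HISTORICAL_TO_CURRENT[product_id]


def category_at_level(product_id: str, k: int) -> str:
    """Return the category of a purchased product at hierarchy level L(k)."""
    path = CURRENT_PRODUCTS[current_product(product_id)]
    if not 1 <= k <= len(path):
        raise ValueError(f"level must be between 1 and {len(path)}")
    return path[k - 1]


if __name__ == "__main__":
    # A purchase of a discontinued desk still profiles the customer
    # in the current "Desks" category.
    print(current_product("desk-2018"), category_at_level("desk-2018", 3))
```

With such a mapping, old purchase tickets that reference withdrawn products can be rolled up to whichever level of the hierarchy the analysis uses.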
An example of these kinds of products for our retailer, a leader in home decoration, is the desk we can see in Figure 5.
Products from the example perfectly adjust to the product hierarchy representation presented in

RFM Based on Product Hierarchy: Step 2
Let $U = \{u_1, \ldots, u_{\#U}\}$ be a company's current customers. Companies that use a relational strategy based on RFM often define this set of customers based on past purchases, using a period of analysis that varies according to the type of company. Therefore, customers are included in the analysis if they have made any type of purchase in the defined period.
Similarly, we use a vector model to represent the purchased products. For a customer $u_e$, we have a vector $VU_e = (VU_{e1}, \ldots, VU_{e\#L(kmax)})$, where each component $VU_{ej}$ represents the importance of the customer's purchases of products in the corresponding category of $L(kmax)$. The value $A_e$ represents the global RFMScore for that particular customer. Some authors note the importance of using the amount of the purchases in recommendation systems and highlight that this usefulness also depends on the recency of the purchase (Pradel et al. [91]). Generalizing this idea, we calculate the importance of the purchase based on the fuzzy linguistic 2-tuple RFM model shown in Section 2.2. Therefore, each $VU_{ej} \in S \times [-0.5, 0.5)$, where $S$ is the linguistic term set of Definition 2, which is equivalent to the set shown in Figure 2.
With the aim of calculating this vector, we should follow the next sub-steps.

Obtaining the Weights of Each Product
A retail company usually has a product portfolio, i.e., the set P, composed of a variety of products, some of which are of great importance as they can generate customer loyalty, and others that are considered less important. In addition, to calculate the importance of the purchase, as mentioned above, we use three variables: recency, frequency and monetary. Using the example of frequency, for certain products (e.g., a bed), this frequency is not very important as its life cycle is longer, and we do not need to buy beds continuously. However, for others with a very short life cycle (e.g., scented candles), frequency is fundamental. In order to solve these two issues, i.e., the importance of the product within the portfolio and the importance of the RFM variables for these products, we propose to use the AHP introduced in Section 2.2.3. We follow the typical phases of this process.

1. Structuring of the decision problem into a hierarchical model
In order to structure the MCDM problem, it is necessary to define the available alternatives and the required criteria. The alternatives are the RFM variables: the aim is to obtain the importance of each of these variables for each of the products in the catalogue, historical or otherwise, i.e., P and HP. The criteria could therefore be the set P of products in use, but their high number would make the problem unmanageable. Fortunately, companies usually have a well-structured product catalogue, as seen in Section 3.1. To define the criteria, we use a portion of the hierarchical portfolio, H, defined in that section.
The criteria set is $C = \bigcup_{k=1}^{kmax} L(k)$, where kmax ∈ {2, …, 5} indicates the maximum level of detail in the portfolio at which the importance of the products, as well as the importance of the RFM variables for their evaluation, can be determined. Figure 6 shows the final hierarchy of the proposed AHP model.

Figure 6. AHP hierarchy.

2. Making pairwise comparisons
The marketing experts fill in the different pairwise matrices corresponding to the criteria of the hierarchical model C, expressing the relative importance of some categories over others in order to assess the customers' purchases. Furthermore, for each of the L(kmax) elements, the importance of each of the alternatives, i.e., of the three RFM variables, is evaluated, generating the corresponding pairwise matrices.

3. Obtaining local weights and consistency of comparisons
In order to ensure the coherence of the given matrices, their CR (Equation (9)) has to be lower than or equal to 0.1. If the CR exceeds this threshold, the business specifications are considered not to meet the quality criteria, i.e., they may contradict each other. The pairwise comparison matrices must then be revised to improve their consistency ratio.
Once the consistency of the matrices has been checked, the weight of each criterion and sub-criterion is calculated. The local weights of the lower level of the criterion (the more granular level within the product portfolio chosen in the first step) for each of the RFM alternatives are expressed as follows: $w_R = (w_{R1}, \ldots, w_{R\#L(kmax)})$, $w_F = (w_{F1}, \ldots, w_{F\#L(kmax)})$ and $w_M = (w_{M1}, \ldots, w_{M\#L(kmax)})$. Finally, from these local weights, using the hierarchical structure, we obtain the weights $W_{RFM} = \{W_R, W_F, W_M\}$.
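As an illustration of phase 3, the following minimal sketch derives the local weights of one pairwise matrix and its consistency ratio. It assumes that Equation (9) is the usual Saaty formulation, CR = CI / RI with CI = (λmax − n) / (n − 1), and it approximates the principal eigenvector by power iteration. The function name `ahp_weights`, the `RANDOM_INDEX` table name and the example judgements are hypothetical and are not taken from the case study.

```python
# Saaty's random consistency index for matrix sizes 1 to 6 (standard table).
RANDOM_INDEX = {1: 0.0, 2: 0.0, 3: 0.58, 4: 0.90, 5: 1.12, 6: 1.24}


def ahp_weights(matrix, iterations=100):
    """Return (weights, consistency_ratio) for a reciprocal pairwise matrix."""
    n = len(matrix)
    w = [1.0 / n] * n
    for _ in range(iterations):
        # Power iteration: multiply by the matrix, then renormalise to sum 1.
        nxt = [sum(matrix[i][j] * w[j] for j in range(n)) for i in range(n)]
        total = sum(nxt)
        w = [x / total for x in nxt]
    # Estimate the principal eigenvalue from A w = lambda_max * w.
    aw = [sum(matrix[i][j] * w[j] for j in range(n)) for i in range(n)]
    lambda_max = sum(aw[i] / w[i] for i in range(n)) / n
    ci = (lambda_max - n) / (n - 1) if n > 1 else 0.0
    ri = RANDOM_INDEX[n]
    cr = ci / ri if ri > 0 else 0.0
    return w, cr


# Hypothetical judgements for one category, rows/columns ordered R, F, M.
example = [
    [1.0, 1 / 2, 1 / 5],  # recency compared with R, F, M
    [2.0, 1.0, 1 / 3],    # frequency compared with R, F, M
    [5.0, 3.0, 1.0],      # monetary compared with R, F, M
]
weights, cr = ahp_weights(example)
is_acceptable = cr <= 0.1  # threshold stated in the text
```

A matrix whose `cr` exceeds 0.1 would be sent back to the experts for revision, as described above.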

Obtaining the Fuzzy Linguistic 2-Tuple RFM Value for Each Customer and Each Product
In this step, we apply the fuzzy linguistic 2-tuple RFM model for each customer and for each of the product categories in L(kmax) from which the customer has purchased during the chosen analysis period. We follow the steps explained in Section 2.2 individually for each historical product, obtaining its corresponding category in L(kmax) and the corresponding $VU_e$ value, which gives the global RFMScore for each of the categories of that level. From this vector, we obtain the value $A_e$, which represents the global RFM value for the customer, using the $W_{RFM}$ weight matrix by means of the operator $A_w$ (Equation (4)).
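The conversion between numeric values and linguistic 2-tuples, and a weighted aggregation over them, can be sketched as follows. This is only an illustration under stated assumptions: the five-label set `LABELS` is hypothetical (only VB and VG are named in the text), and `weighted_aggregate` assumes that the operator $A_w$ is a weighted average of the numeric equivalents of the 2-tuples; Equation (4) remains the authoritative definition.

```python
import math

# Assumed ordered label set; only "VB" and "VG" appear explicitly in the text.
LABELS = ["VB", "B", "N", "G", "VG"]


def to_two_tuple(beta):
    """Map a number in [0, len(LABELS) - 1] to (label, alpha), alpha in [-0.5, 0.5)."""
    index = int(math.floor(beta + 0.5))  # round half up, unlike built-in round()
    index = max(0, min(len(LABELS) - 1, index))
    return LABELS[index], beta - index


def from_two_tuple(label, alpha):
    """Inverse mapping: (label, alpha) back to its numeric equivalent."""
    return LABELS.index(label) + alpha


def weighted_aggregate(two_tuples, weights):
    """Weighted average of 2-tuples, returned as a 2-tuple."""
    total = sum(weights)
    beta = sum(from_two_tuple(label, alpha) * w
               for (label, alpha), w in zip(two_tuples, weights)) / total
    return to_two_tuple(beta)
```

Under these assumptions, a numeric value of 3.946 maps to the label VG with a small negative alpha, which reads as "very good, with room to improve".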

Customer Segmentation by 2-Tuple RFM Value per Product: Step 3
The RFMScores per product obtained in Section 3.2.2 ($VU_e$) are used to define clusters of customers with similar patterns. Many clustering algorithms are available for this, but the RFM model works well with the k-means algorithm. The main objective is to obtain k clusters $C_{HP} = \{C_{HP1}, \ldots, C_{HPk}\}$ with their corresponding k centroids, $v_s = (v_{s1}, v_{s2}, \ldots, v_{s\#L(kmax)})$, $s = 1, \ldots, k$, one for each cluster. The values of these centroids can be expressed using the fuzzy linguistic 2-tuple model, which gives better linguistic interpretability.
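For reference, the standard k-means objective can be written in the notation of this section. The block below assumes that clustering operates on the numeric equivalents of the 2-tuples, written here as $\Delta^{-1}(VU_e)$ and applied component-wise; this symbol is the usual 2-tuple notation and is an assumption, not a definition taken from this section.

```latex
\[
\min_{C_{HP1},\ldots,C_{HPk}} \; \sum_{s=1}^{k} \sum_{u_e \in C_{HPs}}
\left\lVert \Delta^{-1}(VU_e) - v_s \right\rVert^2,
\qquad
v_s = \frac{1}{\# C_{HPs}} \sum_{u_e \in C_{HPs}} \Delta^{-1}(VU_e)
\]
```

Each centroid $v_s$ is the component-wise mean of its cluster, so converting its components back to 2-tuples yields one linguistic profile per cluster.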

Strategy by Segment under Business and Product Preferences: Step 4
Once the set of clusters $C_{HP}$ has been defined, different strategies should be designed to match business needs with customers' needs; as a result, customers' lifecycles should lengthen and the business should consequently improve.

Case of the Model Implementation
This work was elaborated with a real transactional dataset from an online and offline retailer, a worldwide leader in home furnishing and decoration.
The dataset contains more than 25 million ticket lines concerning purchases from May 2014 to May 2020.
The most common situation for retailers is to have no access to socio-demographic information about their customers or, when they have it, for the data to be out of date because customers rarely update their details when something changes. Therefore, the only customer information we used for this investigation was historical transactional data, which ensures the usefulness of the experiment for other retailers, as they will have the same details. As described in Section 3.1, data are detailed at the L(7) product level, which means that the dataset includes historical products, HP, and current products, P.
Data were analysed using the KNIME Analytics Platform, an intuitive, low-code and open-source software for data science. Our intention was to help other researchers and business professionals to understand their data and define machine learning models accessible to everyone.
In this work, we solved a real business problem. Business experts were involved to make decisions and ensure that the obtained results make sense in real situations.
Each step of the scheme shown in Figure 3 is detailed below.

Step 1: Product Hierarchy Definition
Retailers have a huge number of products, which are organised into a structure in order to make them manageable. This structure classifies a company's products and services by their essential components into a logical hierarchy. The product hierarchy is defined to answer business needs, but customers' purchasing behaviour does not have to follow that structure.
Observing the definition of the product catalogue made previously, and the example already introduced in Section 3.1, the current catalogue of the company, with its HP and P products, adapts perfectly to the hierarchy's defined scheme.
To bring to light the product hierarchy implied by customer purchase patterns, we define a new hierarchy, denoted as H, by carrying out a principal component analysis of customer transactional data.
Starting from the customers' purchase ticket details, we must create a new dataset in which each customer is classified following the new hierarchy, which is defined by analysing customer behaviour patterns.
In the first step, the KMO (Kaiser-Meyer-Olkin) and Bartlett's (BTS) tests were applied in order to check whether there was enough redundancy between the variables for them to be summarized with a few factors. These new factors define the product hierarchy that is hidden in customer buying patterns. Bartlett's test determines whether the observed correlation matrix is an identity matrix, while the KMO test is associated with the degree of common variance (Ocal et al. [92]; Hair et al. [93]; Ali et al. [94]).
Bartlett's test resulted in a p-value of 0.0, so we concluded that the correlation matrix between the original product areas is not the identity matrix, and the KMO test resulted in an overall MSA of 0.98. It was therefore clear that we could reduce the original structure into a new, more manageable one, forming a new product hierarchy that reflects the way customers make purchases. Figure 7 shows the new aggregation.
All main furniture areas were strongly correlated with accessories, i.e., when a customer buys a bed and mattress, they also buy accessories, such as pillows, bedlinen and cushions. To differentiate main furniture from these "easy to buy accessories", it was decided to create an "artificial" dimension only for accessories that are purchased in a very different way, and this was called the impulsive products category.
Critic products include all furniture products with a very long decision journey, as they are more expensive, difficult to buy and designed by the client, and they imply great trust in the brand. These products create loyal customers: once the customer buys one of these critic products, the brand associated with these products will always be present in the customer's life.
Reflexive products were an isolated group of products that are important, but not as difficult to purchase as the critic ones.
Evolving products were products related to children. They will change as children grow up.
Seasonal products also became isolated, and this is an interesting category as it will be very useful to generate a sensation of novelty and create traffic to the stores seasonally for all kinds of customers.
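For readers unfamiliar with the two adequacy checks used at the start of this step, their standard textbook forms are given below. The symbols are assumptions introduced for illustration: $n$ is the number of observations, $p$ the number of variables (product areas), $R$ the observed correlation matrix with entries $r_{ij}$, and $q_{ij}$ the partial correlations.

```latex
\[
\chi^2 = -\left(n - 1 - \frac{2p + 5}{6}\right) \ln \lvert R \rvert,
\qquad
\text{df} = \frac{p(p-1)}{2}
\]
\[
\text{KMO} = \frac{\sum_{i \neq j} r_{ij}^2}{\sum_{i \neq j} r_{ij}^2 + \sum_{i \neq j} q_{ij}^2}
\]
```

A large $\chi^2$ (small p-value) rejects the hypothesis that $R$ is the identity matrix, and a KMO value close to 1 indicates that partial correlations are small relative to the raw correlations, which is the situation reported above.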

Step 2: RFMScore Definition Based on Product Hierarchy
The dataset used in this step contains all the historical purchase information for 219,199 (#U) customers, with more than 25 million ticket lines concerning purchases from May 2014 to May 2020.
As retailers do not usually have socio-demographic information about their customers, they need to find a new way to learn more about their clients to support their business decisions. Figure 8 shows an example of the customer information available to any retailer. The information available is the customer ID, the date of the purchase, the product identification number and the amount paid for each product that each customer has purchased on each visit.
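The four ticket fields listed above are enough to derive raw recency, frequency and monetary values per customer and product category. The sketch below is a minimal illustration, not the KNIME workflow used in this work: the function name, the `category_of` mapping, the sample identifiers and amounts are all hypothetical, and frequency is assumed to mean the number of distinct purchase dates.

```python
from collections import defaultdict
from datetime import date


def rfm_by_category(ticket_lines, category_of, analysis_date):
    """Aggregate ticket lines into raw R, F, M values per (customer, category).

    ticket_lines: iterable of (customer_id, purchase_date, product_id, amount).
    category_of: dict mapping product_id to its category in the hierarchy.
    """
    visits = defaultdict(set)   # (customer, category) -> distinct purchase dates
    spend = defaultdict(float)  # (customer, category) -> total amount paid
    for customer_id, purchase_date, product_id, amount in ticket_lines:
        key = (customer_id, category_of[product_id])
        visits[key].add(purchase_date)
        spend[key] += amount
    return {
        key: {
            "recency_days": (analysis_date - max(dates)).days,
            "frequency": len(dates),
            "monetary": spend[key],
        }
        for key, dates in visits.items()
    }


# Hypothetical sample data for illustration only.
sample_lines = [
    ("00193", date(2020, 1, 10), "p1", 500.0),
    ("00193", date(2020, 3, 5), "p1", 120.0),
    ("00193", date(2020, 3, 5), "p2", 15.0),
]
sample_categories = {"p1": "Critic", "p2": "Impulsive"}
raw_rfm = rfm_by_category(sample_lines, sample_categories, date(2020, 5, 31))
```

These raw values would then be scored and converted to linguistic 2-tuples following Section 2.2.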

Step 2.1: Obtaining the RFM Weights for Each Historical Product
To facilitate the understanding of the following sections, we enumerated the different tasks that each one of them encompasses.

1. Structuring the decision problem into a hierarchical model
In Section 4.1, step 1, we proceeded to define a more suitable catalogue H for marketing decisions; these specifications can be carried out only using levels 1 and 2, i.e., kmin = kmax = 2, which indicates that C = L(2) with #L(2) = 5, C = {Critic, Reflexive, Evolving, Seasonal, Impulsive}. The hierarchical AHP model is shown in Figure 9.
Figure 9. AHP model.
The main objective is to define the importance of each alternative (recency, frequency and monetary) based on the underlying information found on customers' product purchases.

2. Making pairwise comparisons
Marketing experts collaborated in this step. Questionnaires were prepared to help them fill in the different pairwise matrices corresponding to the criteria of the hierarchical model described in Figure 9. Experts expressed the importance of some categories over others, taking into account what customers have bought but also introducing the preferences of the business into this judgment.
The first pairwise matrix compares the five criteria; Table 3 represents the pairwise matrix. We can observe how business experts gave more importance to critic products than to any other product category. Critic products are very complex products with a long purchase journey; therefore, when a customer acquires them, they will remain engaged with this brand for a long period of time. The second category was reflexive products. The products in this category have a simpler purchasing process, and yet they generate many sales for the business in addition to being important to achieve a comfortable bedroom atmosphere. Evolving and impulsive products follow in the level of importance. Evolving products are related to children. Families with children will need, over time, to change furniture and decorations as their children grow up; they are, therefore, a key customer segment for the business, as this type will remain linked to the brand longer than any other. Impulsive products are important as they work very well as traffic generators and as products to engage all kinds of customers, because they do not need a long purchase decision process and are accessible to everyone. The least important category was seasonal; products belonging to this category are useful for creating a novelty feeling and catalogue refreshment but generate fewer sales than any other category.
After the first pairwise matrix was completed, it was necessary to evaluate all criteria against the three alternatives. Figure 10 shows the pairwise matrices for this process.
It is easy to observe that, for critic products, the most important alternative is clearly monetary, as they are products that customers purchase only once and spend a large amount of time and money on. Reflexive products follow the same pattern, though less markedly, and evolving products invert the order, being more important for frequency than for recency, with monetary always being the most important. Seasonal products have a totally different balance as they are products that appear seasonally in the catalogue. For them, recency is not important, but frequency and monetary have the same weights. Impulsive products, due to their own definition, should have high importance for recency and frequency and low importance for monetary.

3. Obtaining local weights and consistency of matrix comparisons
Subsequently, all matrices were defined, and their consistency was checked. Table 4 includes the CR for all of our matrices.
Once the consistency was checked, the weight of each criterion and sub-criterion was calculated as defined in Equation (8). Table 5 shows the eigenvector of matrix H, which can be understood as the weights for the five different criteria, together with the five eigenvectors of the matrices comparing the R, F and M alternatives for critic, reflexive, evolving, seasonal and impulsive products, respectively. Once we had defined the local weights, we calculated the final weights for the R, F and M alternatives by multiplying both tables, obtaining the final vector $(w_R, w_F, w_M)$ as shown in Equation (10): $w_R = 0.204$, $w_F = 0.186$ and $w_M = 0.609$. This allowed us to determine the importance of each alternative for this company: approximately 61% for monetary, 19% for frequency and 20% for recency.
Notably, these local weights can be changed to adapt to business needs, which makes this method an important tool for marketers. With this methodology, companies can adapt their preferences (weights of products) to support business needs following the commercial calendar or marketing actions, which should help marketing campaigns succeed. If, during a certain period of time, they focus on a particular area, for example, the living room, they can reinforce that area by working with the pairwise comparison matrices and changing the weights for each area; each product then inherits the weights from its hierarchy.
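The multiplication of the two tables can be illustrated with a short sketch. It assumes that Equation (10) is the usual AHP synthesis, in which each global weight is the sum over categories of the category weight times the local weight of the alternative. All numbers below are hypothetical values chosen only to respect the orderings described in the text; they are not the values of Table 5.

```python
def global_rfm_weights(criteria_weights, local_weights):
    """Combine category weights with per-category R, F, M local weights."""
    totals = {"R": 0.0, "F": 0.0, "M": 0.0}
    for category, category_weight in criteria_weights.items():
        for variable, local_weight in local_weights[category].items():
            totals[variable] += category_weight * local_weight
    return totals


# Hypothetical category weights (they sum to 1); not the values of Table 5.
criteria_weights = {
    "Critic": 0.45, "Reflexive": 0.25, "Evolving": 0.12,
    "Impulsive": 0.12, "Seasonal": 0.06,
}
# Hypothetical local weights per category (each row sums to 1).
local_weights = {
    "Critic":    {"R": 0.15, "F": 0.10, "M": 0.75},
    "Reflexive": {"R": 0.25, "F": 0.15, "M": 0.60},
    "Evolving":  {"R": 0.15, "F": 0.25, "M": 0.60},
    "Seasonal":  {"R": 0.10, "F": 0.45, "M": 0.45},
    "Impulsive": {"R": 0.40, "F": 0.40, "M": 0.20},
}
final_weights = global_rfm_weights(criteria_weights, local_weights)
```

Raising the weight of one category in `criteria_weights` (and lowering the others so that they still sum to 1) shifts the final R, F and M weights towards that category's local profile, which is the adaptation mechanism described above.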

Step 2.2: Obtaining the 2-Tuple RFM Value for Each User and Product
These calculations are stored in the following vector for each customer $U_e$: $VU_e = (VU_{e1}, \ldots, VU_{e\#P})$, where $VU_{ei}$ ∈ S1 × [−0.5, 0.5) represents the linguistic 2-tuple importance value of the product $p_i$ for the customer $U_e$. If the customer had never purchased that product (during the period), it would have a 0 value or the tuple (VB, 0.0). Figure 11 shows a sample of customers with fuzzy linguistic 2-tuple RFMScores per product area.
We can see how each customer was classified in terms of their RFMScore per product hierarchy. For example, the first customer (with CustomerID = 0223) is a bad customer with a negative alpha for critic products and very bad in reflexive, evolving, seasonal and impulsive products. The third customer (CustomerID = 00193) is very good in critic, evolving and impulsive products but not in seasonal and reflexive products. Therefore, here we discover a way to keep this customer "alive", which is offering them the seasonal collections four times per year or to redecorate their bedroom with new small furniture and bed textiles.
The last column of this table includes the global fuzzy linguistic 2-tuple RFMScore per product, which offers a general view of the customer value where the different product areas have been taken into account to define the $A_e$ value. Following our example, we can see how customer "02330" is a bad customer reinforced with a negative alpha value and customer "00193" is a very good one, but as they have VB values in reflexive and seasonal, $A_{00193} = (VG, -0.054)$, i.e., they are a very good customer but with a negative alpha, which indicates that they can still improve.

Step 3: Clustering Customers Based on RFMScore for Each Product Hierarchy
Once we classified every customer in terms of fuzzy linguistic 2-tuple RFMScore per product area, i.e., we calculated the VU e vector, we can move to the next step and try to define groups of customers with similar profiles based on these RFMScores. When a customer has a high value in one product hierarchy, this means that they are a very good customer for that aggregation of products; therefore, we will be able to define a customized strategy based on their characteristics. The best way to achieve a global picture of our customers to properly define the correct actions for each one is to launch a segmentation using the RFMScores calculated with product hierarchies and weights defined by business experts.
The first step was to define the optimal number of clusters, which was calculated based on the within-cluster-sum-of-squares (WCSS) method, also taking into account the expert interpretation of customers, so it was decided to take five as the optimal number of clusters. Figure 12 shows the elbow graph where we can see how k = 5 fits perfectly with this decision. We can see how each customer was classified in terms of their RFMScore per product hierarchy. For example, the first customer (with CustomerID = 0223) is a bad customer with a negative alpha for critic products and very bad in reflexive, evolving, seasonal and impulsive products. The third customer (CustomerID = 00193) is very good in critic, evolving and impulsive products but not in seasonal and reflexive products. Therefore, here we discover a way to keep this customer "alive", which is offering them the seasonal collections four times per year or to redecorate their bedroom with new small furniture and bed textiles.
The last column of this table includes the global fuzzy linguistic 2-tuple RFMScore per product, which offers a general view of the customer value where the different product areas have been taken into account to define the Ae value. Following our example, we can see how customer "02330" is a bad customer reinforced with a negative alpha value and customer "00193" is a very good one, but as they have VB values in reflexive and seasonal, the A00193 = (VG, −0.054), i.e., they are a very good customer but with a negative alpha, which indicates that they can still improve.

Step 3: Clustering Customers Based on RFMScore for Each Product Hierarchy
Once we classified every customer in terms of fuzzy linguistic 2-tuple RFMScore per product area, i.e., we calculated the VUe vector, we can move to the next step and try to define groups of customers with similar profiles based on these RFMScores. When a customer has a high value in one product hierarchy, this means that they are a very good customer for that aggregation of products; therefore, we will be able to define a customized strategy based on their characteristics. The best way to achieve a global picture of our customers to properly define the correct actions for each one is to launch a segmentation using the RFMScores calculated with product hierarchies and weights defined by business experts.
The first step was to define the optimal number of clusters, which was calculated with the within-cluster-sum-of-squares (WCSS) method, also taking into account the expert interpretation of customers. On this basis, five was chosen as the optimal number of clusters. Figure 12 shows the elbow graph, where we can see how k = 5 fits this decision. Clustering was performed using the k-means algorithm. The variables used to define customer clusters were the fuzzy linguistic 2-tuple RFMScores for each product hierarchy. Figure 13 shows the radar plot for each cluster. Once clusters CHP were defined and described, we classified the full customer database in terms of their historical purchases; therefore, we were also able to define other areas with potential for each customer. It is important to remark that all this information was acquired using only the historical purchase database.
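The elbow procedure described above can be illustrated with a minimal sketch. It assumes each customer is represented by a vector of five per-hierarchy RFMScores standardized to 0-1; the sample values, the function names and the plain Lloyd's k-means written here are illustrative assumptions, not the authors' Knime workflow.

```python
import random

def squared_distance(a, b):
    """Squared Euclidean distance between two score vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(point, centroids):
    """Index of the centroid closest to the given point."""
    return min(range(len(centroids)),
               key=lambda c: squared_distance(point, centroids[c]))

def kmeans(points, k, n_iter=100, seed=0):
    """Plain Lloyd's k-means. Returns (centroids, labels, wcss)."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(n_iter):
        labels = [nearest(p, centroids) for p in points]
        updated = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                updated.append([sum(col) / len(members) for col in zip(*members)])
            else:
                updated.append(centroids[c])  # leave an empty cluster's centroid as is
        if updated == centroids:
            break  # converged: centroids no longer move
        centroids = updated
    labels = [nearest(p, centroids) for p in points]
    wcss = sum(squared_distance(p, centroids[lab]) for p, lab in zip(points, labels))
    return centroids, labels, wcss

# Hypothetical customers: scores for (critic, reflexive, evolving, seasonal, impulsive).
customers = [
    [0.10, 0.05, 0.08, 0.04, 0.06],
    [0.15, 0.10, 0.05, 0.08, 0.10],
    [0.90, 0.10, 0.85, 0.05, 0.92],
    [0.88, 0.15, 0.90, 0.10, 0.85],
    [0.95, 0.90, 0.92, 0.88, 0.90],
    [0.90, 0.85, 0.95, 0.90, 0.93],
    [0.20, 0.80, 0.15, 0.85, 0.10],
    [0.25, 0.85, 0.10, 0.80, 0.15],
]

# WCSS for each candidate k; the "elbow" is where the decrease flattens.
elbow = {k: kmeans(customers, k)[2] for k in range(1, 6)}
centroids, labels, _ = kmeans(customers, 4)
for k, wcss in elbow.items():
    print(k, round(wcss, 4))
```

In practice the WCSS curve would be plotted against k and read together with the experts' interpretation of the resulting segments, as the text describes.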

Step 4: Strategy by Segment under Business and Product Preferences
At this point, we developed our proposal to improve strategic marketing decisions based on the fuzzy linguistic 2-tuple RFM model per product hierarchy. The linguistic labels make the data easier to interpret. It is remarkable that the local weights defined per product category can be changed to follow business needs, which makes this method an important tool for marketers. With this methodology, companies will be able to adapt their preferences (weights of products) to support business needs following the commercial calendar or marketing actions.
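The effect of changing the local weights can be sketched as follows. This is only an illustration under stated assumptions: the per-hierarchy scores are taken as standardized 0-1 values, the global score is assumed to be a weighted mean of them, and the customer scores and both weight vectors are hypothetical rather than the AHP weights obtained in the case study.

```python
HIERARCHIES = ("critic", "reflexive", "evolving", "seasonal", "impulsive")

def global_score(scores, weights):
    """Weighted mean of per-hierarchy scores (assumed aggregation operator)."""
    total = sum(weights[h] for h in HIERARCHIES)
    return sum(scores[h] * weights[h] for h in HIERARCHIES) / total

# Hypothetical customer: strong in critic, evolving and impulsive products,
# weak in reflexive and seasonal ones.
scores = {"critic": 0.9, "reflexive": 0.1, "evolving": 0.9,
          "seasonal": 0.1, "impulsive": 0.9}

# Hypothetical weight vectors: a balanced one and one for a seasonal campaign.
balanced = {h: 0.2 for h in HIERARCHIES}
seasonal_campaign = {"critic": 0.15, "reflexive": 0.15, "evolving": 0.15,
                     "seasonal": 0.40, "impulsive": 0.15}

baseline_score = global_score(scores, balanced)
campaign_score = global_score(scores, seasonal_campaign)
print(round(baseline_score, 3), round(campaign_score, 3))
```

With the seasonal weight raised, the same customer obtains a lower global score, which makes the unexploited seasonal potential more visible when the business priority shifts.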

Discussion
We defined a method that helps retailers better approach their customers by analysing the only data that all of them have: the ticket line details.
As we already mentioned, our model is based on the fuzzy linguistic 2-tuple approach that carries out processes of "computing with words" without a loss of information and offers more accurate linguistic interpretability.
For the sake of seeing the new contributions of this work, we compare the results with the global fuzzy linguistic 2-tuple RFM model presented by Carrasco et al. [36]. Despite not being the focus of our research, we consider it important to show the calculation of the global fuzzy linguistic 2-tuple RFM model with our dataset in order to better understand the contributions of our approach. To facilitate the interpretability of the full process, we indicated each different step of the followed method.

Defining recency, frequency and monetary global variables.
We will work again with the same dataset containing all the details of ticket lines per customer. Figure 14 shows the output table from Knime after calculating the first three variables: the traditional recency, frequency and monetary. As observed when we calculated the RFMScore by product hierarchy, the first customer, "02330", has bad recency, very low frequency and low monetary value. Customer "00193" has very good recency and good frequency and monetary value. This customer was classified as very good for the critic, evolving and impulsive hierarchies and very bad for reflexive and seasonal products. They belonged to CHP4, which has been labelled as "High potential for Reflexive and Seasonal"; therefore, we have a tool to engage and develop this customer by offering them seasonal and reflexive products, maintaining their engagement and extending their customer lifetime. Working with only ticket details, we were able to develop a tool to know customers better, but we do not yet know how active they are. On top of a customer having good or bad behaviour in terms of product hierarchies, the question that arises is: is this customer active or not?
Working with the three variables shown in Figure 14, we were able to calculate the fuzzy RFM model. Figure 15 shows the new variables in terms of fuzzy linguistic 2-tuples. Again, we can follow our known customers and see how customer "02330" is very bad in recency and frequency and a bad customer in monetary value. Customer "00193" is a good customer with a positive alpha parameter for recency and a very good one for the other categories but with a negative alpha parameter, which means that this customer is very good but still has room to improve.
In order to aggregate the three fuzzy variables and obtain a single score for each customer, we can also calculate the fuzzy linguistic 2-tuple RFMScore by assigning weights to each dimension. In this case, experts found it more difficult to find the perfect weight for recency, frequency and monetary, as they wanted to increase all three variables, so they decided to balance them, assigning one third to each one, which gives the vector W = (1/3, 1/3, 1/3). With no extra information, and after working with the possibility to prioritize per product area, it remained difficult for them to rank the three variables. We calculated the RFMScore for each customer as RFMScore = RScore × 1/3 + FScore × 1/3 + MScore × 1/3, and once we have the RFMScore, we can also translate it into a fuzzy linguistic 2-tuple format. Figure 16 shows our sample of customers with these new variables calculated.
Following the customers used as an example, "02330" is a very bad customer in terms of global 2-tuple RFMScore, as they have a value of 0.1 for their global RFMScore (which has been standardized into 0-1 values). Additionally, customer "00193", with an RFMScore of 0.918, has a value of very good with a negative alpha in the 2-tuple format.
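The aggregation and the translation into a 2-tuple can be sketched as below. This is a minimal illustration, assuming a five-label set and the usual 2-tuple translation in which the score is mapped onto the label index scale, rounded to the nearest label, and the remainder kept as alpha. The label names (in particular the middle label "M"), the function names and the alpha scale are assumptions and may differ from the representation used in the figures.

```python
import math

# Assumed five-label set; "M" (medium) is a hypothetical name for the middle label.
LABELS = ("VB", "B", "M", "G", "VG")

def rfm_score(r, f, m, weights=(1 / 3, 1 / 3, 1 / 3)):
    """Weighted aggregation of standardized recency, frequency and monetary scores."""
    return r * weights[0] + f * weights[1] + m * weights[2]

def to_two_tuple(score, labels=LABELS):
    """Translate a 0-1 score into (label, alpha), with alpha in [-0.5, 0.5)."""
    g = len(labels) - 1                      # highest label index
    beta = score * g                         # position on the label index scale
    index = min(g, max(0, math.floor(beta + 0.5)))  # round half up to nearest label
    alpha = beta - index                     # symbolic translation
    return labels[index], alpha

# Global RFMScores quoted in the text for the two example customers.
label_02330, alpha_02330 = to_two_tuple(0.1)
label_00193, alpha_00193 = to_two_tuple(0.918)
print(label_02330, round(alpha_02330, 3))
print(label_00193, round(alpha_00193, 3))

# Hypothetical component scores, only to show the balanced aggregation.
example_score = rfm_score(0.3, 0.6, 0.9)
```

Under these assumptions, the score 0.1 falls on the lowest label and 0.918 falls on the highest label with a negative alpha, which agrees with the qualitative reading given in the text.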

Clustering customers based on global RFMScore.
Once we prepared all variables, we can complete the RFMScore and fuzzy linguistic 2-tuple RFMScore with the customer segmentation based on the three variables, fuzzy linguistic 2-tuple recency, frequency and monetary.
Working with business experts, after observing the elbow graph, we decided to define four clusters to segment our customer database. Figure 17 shows the new elbow graph, in this case calculated for the three variables mentioned above. Clustering was performed using the k-means algorithm. In this case, the variables used to define customer clusters were the fuzzy linguistic 2-tuple recency, frequency and monetary. The process uncovers four clusters called Ci. Figure 18 shows the radar plots for each cluster.
Figure 17. Elbow graph to decide the optimal number of clusters.
The radar plots effectively describe each cluster. In this case, we only have three variables. We can see in Figure 18a how all variables have very high values, which is why we labelled this cluster as top. Figure 18b shows a group of customers with bad recency but relatively good monetary and frequency values. This cluster contains customers that were once good customers but are abandoning the brand, which is why we labelled the cluster as churn. Figure 18c shows the cluster where the worst customers are grouped; the three variables have very low values, so we labelled the cluster as worse. Figure 18d shows a group of customers with very good recency but low values for frequency and monetary. These seem to be new customers who are starting their purchases and will improve in time, which is why we labelled this cluster as new.

Comparing and Enriching Results
The global RFM model offers marketers a good view of customer activity but not information about products. The RFMScore per product hierarchy and the consequent segmentation classifies the customer database into different groups that offer marketers and businesspeople the possibility to push products following the business needs, and consequently, to develop custom strategies adjusted to the customers groups.
As shown in Table 6, the concatenation of both models in the cross table helps to see how coherent the results are. Top customers in terms of products are mainly grouped into top customers in terms of activity (RFM model), which means that the best products create loyal customers with a high level of activity in terms of recency, frequency and monetary.
Customers in the group labelled as low in all categories are grouped into worse customers (low recency, low frequency and low monetary) or new customers (good recency but low frequency and monetary). On the other hand, if we focus on the vertical component of the table, we see how customers labelled as churn in terms of RFM are behaving in terms of product hierarchies, and we have a great tool to try to reactivate the most interesting ones.
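A cross table of this kind can be built by counting customers for each pair of segment labels. The sketch below is only illustrative: the customer identifiers and their assignments are hypothetical, and the segment names are borrowed from the labels used in the text rather than taken from Table 6.

```python
from collections import Counter

# Hypothetical assignments: (customer, product-hierarchy segment, activity segment).
assignments = [
    ("c01", "Top in all categories", "top"),
    ("c02", "Top in all categories", "top"),
    ("c03", "Top in all categories", "churn"),
    ("c04", "Low in all categories", "worse"),
    ("c05", "Low in all categories", "new"),
    ("c06", "Low in all categories", "worse"),
    ("c07", "High potential for Reflexive and Seasonal", "top"),
    ("c08", "High potential for Reflexive and Seasonal", "churn"),
]

def cross_table(pairs):
    """Count customers for every (row segment, column segment) combination."""
    counts = Counter(pairs)
    rows = sorted({r for r, _ in pairs})
    cols = sorted({c for _, c in pairs})
    return {r: {c: counts[(r, c)] for c in cols} for r in rows}

table = cross_table([(product, activity) for _, product, activity in assignments])
for row, cells in table.items():
    print(row, cells)
```

Reading the table by rows shows how a product-based segment spreads over activity segments; reading it by columns shows, for example, which product profiles the churn customers have.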
Finally, Table 7 shows the cross table for the linguistic labels assigned to customers by the two models. The first set of labels comes from the 2-tuple RFMScore per product hierarchy, calculated with the weights from the AHP model (60% for monetary, 18% for frequency and 20% for recency), as we defined in 3.3.2.1. The second set comes from the fuzzy global RFMScore, for which the experts assigned 33% to recency, 33% to frequency and 33% to monetary.
Both models are necessary and complement each other. A customer who is VG in terms of product may not be VG in terms of general recency, frequency and monetary, and vice versa.
The other result of our work is a very complete customer segmentation that enriches the clusters obtained with the traditional fuzzy linguistic 2-tuple RFM model and offers a clear view of customer preferences and possible actions to define cross- and up-selling strategies, as well as to adapt communication to the customer life cycle.
We detected some limitations to our approach that open interesting areas for future research. The first one is related to geospatial information, which was not included in the model. We suspect that the geolocation of customers may directly affect their preferences and shopping patterns, so this could be a very interesting area in which to continue our work. This point becomes especially relevant if we take into account that online sales have opened doors to the world for any small business that wants to sell through the Internet. We also think that the seasonality of the data could be affecting the results, so exploring the inclusion of seasonality factors could further improve the results of the model. Another possible improvement to our theoretical model could be the generalization to a multi-hierarchical fuzzy 2-tuple model; as Cid-López et al. [95] stated, it will offer a richer interpretation of the result. Another possible way to improve our findings would be the inclusion of new variables in the model, such as the periodicity or cadence of customer purchases (Peker et al. [50]). It will also be interesting to apply this model to other businesses to enrich the results.

Conclusions
This work improves the fuzzy linguistic 2-tuple RFM model by including product information in the model to calculate the RFMScore for each of the products that a customer has bought. As the number of references could be huge, we described, thanks to PCA, the product hierarchy defined by customers during their purchases. The AHP method, with the support of business experts, helped us to define the different weights of each RFMScore per product category, obtaining a more complete approach to customer preferences and customer value.
The fuzzy linguistic 2-tuple RFM model adapted by the product hierarchy was revealed as a useful tool for including the business criteria, product catalogues and customer insights on the definition of commercial strategies. It is remarkable that the local weights defined per product category could be changed to follow business needs, which transforms this method into an important tool for marketers. With this methodology, companies will be able to adapt their preferences (weights of products) to support the business needs following the commercial calendar or marketing actions.
The concatenation of the global fuzzy linguistic 2-tuple RFM model with the new fuzzy linguistic 2-tuple RFM model per product hierarchy offers a more effective customer segmentation that enriches the results offered by each model separately.
As a consequence of this approach, retailers will be able to combine the two different perspectives, the customer-centric, by applying the global fuzzy linguistic 2-tuple RFM model, and the product-centric one, thanks to the fuzzy linguistic 2-tuple RFM model per product hierarchy.
It is important to remark that, if we want to know customer preferences, we need to work with all historical products, current or out-of-range, related to each customer. The out-of-range products are necessary to better profile each customer, but if we need to use the insights extracted from this analysis to recommend a product to a customer, we should recommend only products that are currently "alive". Here, some retailers may have a problem if they have not saved their sales history or if they are not able to link each ticket to a customer. Bear in mind that analysing anonymous tickets is not the same as being able to associate each purchase with the customer who has made it over time.
One difficulty encountered in the development of the empirical model was the considerable need for business knowledge to define the structure and weights of the hierarchical model. The joint work of business experts and researchers was necessary for the correct interpretation and definition of the hierarchies. The proposed theoretical model has been implemented using the R and Python languages embedded in nodes of Knime 4.3.2. This has allowed us to verify its results on a practical and not just theoretical level. Everything is open source to help other researchers and professionals apply our contribution.