Analyzing the Dynamics of Customer Behavior: A New Perspective on Personalized Marketing through Counterfactual Analysis

Abstract: The existing body of research on dynamic customer segmentation has primarily focused on segment-level customer purchasing behavior (CPB) analysis to tailor marketing strategies for distinct customer groups. However, these approaches often lack the granularity required for personalized marketing at the individual level. Moreover, the analysis of customer transitions between different groups has largely been overlooked. This study addresses these gaps by developing an efficient framework that enables businesses to forecast customer behavior, assess the impact of various strategies on each customer separately, and analyze customer transitions between segments. This can facilitate providing personalized marketing strategies, fostering a gradual transition toward a desired customer status, and enhancing overall marketing precision. In this study, we employ time series feature vectors encompassing recency, frequency, monetary value, and lifespan, applying the K-means algorithm with a range of distance metrics for customer segmentation along with classification algorithms to predict customer behavior. Leveraging counterfactual analysis, we establish a solution for analyzing customer transitions between groups and evaluating personalized marketing strategies. Our findings underscore the superior performance of the Euclidean distance metric, closely followed by the Manhattan distance, in distinguishing the patterns in time series customer behavior, with logistic regression excelling in predicting customer status. This study enables decision-makers to forecast the impact of diverse marketing strategies on customer behavior, which facilitates customer retention and engagement through well-informed decisions.


Introduction
Customers include both individuals and businesses that purchase products or services from a company. Maintaining a good relationship with customers over a long period can lead to increased profits from existing customers for an enterprise [1]. According to Kotler and Armstrong [2], while attracting customers is undoubtedly important, the retention of customers holds even greater significance. This is because losing a customer not only results in an immediate loss but also forfeits the potential lifetime value of their purchases. To enhance customer retention and boost profitability, each organization must tailor marketing strategies, efficiently allocate resources, and effectively meet the different needs of its customer base [3,4]. In recent decades, with the rise of personalized marketing in e-commerce, traditional mass marketing is becoming increasingly obsolete [5]. Personalized marketing should utilize individual-level information to tailor interactions, enhancing customer experience and marketing effectiveness for a competitive advantage in a knowledge-driven world [6]. The shift towards personalized marketing strategies requires a profound understanding of each customer's unique behavioral characteristics. Segmentation is a commonly employed method to achieve this objective [5,7].
Customers often exhibit diverse preferences, which makes customer segmentation a valuable strategy for effectively managing companies' relationships with their customers [8,9]. Customer segmentation is a popular tool that involves grouping customers with similar characteristics and attributes from larger, more heterogeneous groups of customers [9-11]. Traditional customer segmentation methods, such as those described by Calvet et al. [12], typically categorize customers based on descriptive variables like demographic attributes. However, when demographic data are unavailable or cannot be inferred from the existing data, these conventional segmentation techniques become impractical [13]. The recency, frequency, and monetary (RFM) model is a widely used value-based approach to conducting behavioral customer segmentation [5,14-21]. Specifically, "recency" pertains to the time elapsed since the last purchase, "frequency" signifies the number of purchases within a defined time frame, and "monetary" reflects the amount of money spent during this specified period [22]. These three variables fall under the category of behavioral attributes and can serve as segmentation criteria by evaluating customer attitudes towards the product, brand, benefits, or even their loyalty, all gleaned from the database [23].
The main challenge in the current literature on customer segmentation, including studies that utilize the RFM model, is that a single time frame is typically used to capture customer behavior, an approach often referred to as "static segmentation" (see, for example, [24-29]). The primary limitation of static segmentation approaches lies in their inability to model the dynamic behavior of customers and uncover significant trends and patterns [30-32]. A few studies adopt a dynamic approach to customer segmentation [31]. These approaches entail either monitoring customer behavior trajectories over time using techniques like sequential rule mining [32] or representing customer behavior as time series and subsequently applying time series analytical methods [15,19]. The main advantage of dynamic customer segmentation utilizing time series features over other approaches lies in its predictive capability. The primary limitation associated with sequential rule mining approaches lies in their descriptive nature, as they are designed to elucidate historical trends in customer behavior rather than to forecast future customer behavior [17]. In summary, the main goal of earlier studies on customer segmentation utilizing static approaches was to segment customer behavior, analyze each segment, and identify prevailing trends. In dynamic approaches, researchers strive not only to achieve these objectives but also to predict future customer behavioral patterns. This information can then be utilized by companies' marketing departments to make well-informed decisions tailored to each customer group. Table 1 summarizes research studies on dynamic customer segmentation using time series features.

Table 1. Research studies on dynamic customer segmentation using time series features (table body not recovered; the surviving cell fragments, which describe a reviewed study, are listed below).
• The dataset used in this study is the online retail dataset, which is generated from non-store online retail transactions registered in the UK.
• The research concluded that the proposed method improves the accuracy of customer value, user-level correlation analysis, and explanation of intermediary effects. It can provide marketing strategies for diverse customer segments while ensuring the quantity and quality of these groups.
As is evident from Table 1, the existing literature on dynamic customer segmentation using time series features predominantly concentrates on predicting customer behavior and analyzing purchasing patterns within distinct segments, where marketing strategies are designed for each group of customers based on shared characteristics in each segment. Our study shifts the focus to analyzing each customer separately, allowing businesses to assess and influence the purchasing behavior of their customers individually. This approach enables the design of personalized marketing strategies at the individual level, rather than the segment level, which can more effectively foster desired customer behaviors. Moreover, existing studies often overlook the transitions of customers between segments over time, while our study addresses this gap by analyzing how customers transition between segments and how CPB features should be affected by strategies for facilitating these transitions or maintaining a customer within the desired segment. In this regard, we introduce counterfactual analysis as a novel concept to the existing literature. While many studies contribute theoretical insights, fewer offer practical tools that businesses can readily implement. Our research aims to develop an efficient model that businesses can use to predict their customer behavior and assess the impact of various marketing strategies on each customer over time. This practical application should be computationally efficient to handle large datasets and is designed to enhance marketing precision and foster customer engagement and retention.
To achieve the above-mentioned objectives, our research approach begins with conducting time series clustering of customer behaviors, employing an extended version of the RFM model, and selecting an appropriate clustering algorithm based on the evaluation results. Subsequently, we employ classification algorithms to train a predictive model and forecast customer behavior status. Finally, we utilize counterfactual analysis to provide a tool that assists decision-makers in evaluating the effect of potential targeted strategies on each customer individually, aiming to retain existing customers and guide them into desired segments.
The remainder of the paper is organized as follows: Section 2 begins with an examination of algorithms suitable for time series customer segmentation. In Section 3, the proposed methodology and framework are presented. Section 4 is dedicated to providing the experimental results and Section 5 presents discussion and managerial implications. In Section 6, we draw our study to a conclusion, and finally, in Section 7, the limitations of this study and future research directions are elaborated.

Time Series Segmentation Algorithms
In this section, a variety of clustering algorithms well suited for time series segmentation are explored. Clustering algorithms can be broadly categorized into two groups: hierarchical and partitioning algorithms [33]. Other studies also identified grid-based, model-based, density-based, and multi-step clustering algorithms as the primary categories of clustering methods [34]. In the following, we delve into the key aspects of applying each clustering group to time series data.
Hierarchical clustering is a versatile approach in cluster analysis used to create a hierarchy of clusters using agglomerative (bottom-up) or divisive (top-down) algorithms [35]. It generates nested clusters by considering pairwise distances between data points. This is accomplished by utilizing a proximity measure referred to as the linkage metric. Hierarchical clustering can be well suited for applications where the number of clusters is challenging to define, and it can handle time series of unequal lengths when equipped with elastic distance measures like Dynamic Time Warping (DTW) [36,37]. Berkhin (2006) noted that hierarchical clustering algorithms that rely on linkage metrics face challenges related to their time complexity. Therefore, hierarchical clustering is typically not well suited for efficiently handling large time series datasets [38]. This limitation stems from its quadratic computational complexity, meaning that the algorithm's processing time increases quadratically with the number of data points. Ward's clustering method, introduced by Ward in 1963, is an agglomerative clustering algorithm that is distinct from the linkage metric approach. Instead, it is founded on the objective function of K-means, with the merging decision contingent on its impact on this function. It is worth noting that this clustering method is best suited for quantitative variables and may be less appropriate for binary variables [39]. Taking computational complexity into consideration within the broader context of clustering categorizations, it is well established that the hierarchical clustering method is characterized by a complexity of O(n^2) [39], which makes it less suitable for time series clustering applications.
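To make the notion of an elastic distance measure concrete, the following is a minimal sketch of the classic dynamic-programming form of DTW for two univariate sequences. The function name `dtw_distance` and the use of the absolute difference as the local cost are illustrative assumptions, not details taken from this study.

```python
def dtw_distance(a, b):
    """Classic O(len(a) * len(b)) DTW between two numeric sequences.

    Assumption: the local cost is the absolute difference |a_i - b_j|.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible predecessor alignments
            cost[i][j] = step + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


# Sequences of unequal length can still be compared
print(dtw_distance([1, 2, 3], [1, 2, 2, 3]))
```

Because the alignment may repeat elements of either sequence, the two inputs do not need to have the same length, which is the property referred to above.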
Partitioning algorithms are used for grouping similar data points into distinct clusters or partitions [34]. The dataset containing n objects is progressively divided into a predefined number (k) of separate subsets through an iterative process aimed at optimizing a specific criterion function [40]. One of the widely employed algorithms in partitioning clustering is K-means [41]. It aims to create clusters by minimizing the total distance between objects within each cluster and their respective prototypes, which are typically the mean values of cluster objects [41]. In contrast, k-medoids (Partitioning Around Medoids, PAM) assigns the center of each cluster as one of the data points within the cluster, specifically the one that is closest to the other points in that cluster [35]. However, one significant challenge in both K-means and k-medoids is the need to pre-assign the number of clusters (k), which complicates the clustering of time series data [42,43]. Fuzzy clustering methods, such as Fuzzy c-Means (FCM) and Fuzzy c-Medoids, offer a "soft" approach to clustering, where each object holds a degree of membership within each cluster [44-46]. The computational complexity of partitioning algorithms is O(n) [39], which is significantly more efficient than that of hierarchical algorithms, where the complexity is O(n^2), making them a better option for time series data.
The other clustering models discussed in previous studies include model-based clustering, which seeks to recover the original model from a given dataset by assuming a model for each cluster and finding the best fit of data to that model [34]. This approach involves the selection of centroids and the addition of noise with a normal distribution, which results in the recovery of a model that defines the clusters [47]. Typically, model-based methods use statistical or neural network approaches. For instance, a Self-Organizing Map (SOM), a model-based clustering method based on neural networks, has been applied in time series clustering [48]. However, the SOM method struggles with time series of unequal lengths due to the need to define the dimension of weight vectors [35]. Model-based clustering has two main drawbacks: the need to set parameters and the reliance on user assumptions, which can lead to inaccurate cluster results and slow processing times, especially when using neural networks on large datasets [49]. Second, density-based clustering identifies clusters as subspaces of dense objects separated by subspaces of low-density objects [34]. One well-known algorithm based on the density concept is DBSCAN, which expands clusters if their neighbors are dense [50]. Chandrakala and Chandra propose a density-based clustering method in kernel feature space for clustering multivariate time series data with varying lengths [51]. They also introduce a heuristic method for finding the initial values of the algorithm's parameters. However, density-based clustering is not widely employed for time series data clustering due to its relatively high complexity [34]. Grid-based clustering methods discretize the data space into a finite number of cells organized in a grid-like structure. Subsequently, clustering operations are carried out on these grid cells. Two prominent examples of clustering algorithms that adopt this grid-based approach are STING [52] and WaveCluster [53]. Finally, multi-step clustering involves the fusion of various techniques, often referred to as a hybrid method, that is applied to enhance the overall quality of cluster representation [54,55].

Materials and Methods
This section provides a detailed, step-by-step methodology for segmenting and analyzing customer purchasing behavior utilizing the concept of counterfactual analysis. The proposed methodology is then implemented using real-world data in the following section.

Data Understanding and Preprocessing
The dataset employed in this study consists of transactions conducted by customers of an information technology company. The dataset covers a period of nearly six years, ranging from May 2017 to March 2023, incorporating a total of 271,152 transactions attributed to 10,393 unique customers. The essential steps employed to transform the initial data into a clean, structured, and analytically ready format are detailed in the following. The goal is to enhance the quality, consistency, and usability of the dataset, establishing a solid foundation for our study. The remainder of this section covers the handling of missing values and inconsistencies, outlier detection, feature engineering, and normalization. In the first step, duplicate records and missing values are removed. Then, the data are validated against the expected formats and business rules and prepared for the next step.
Dealing with Outliers in Raw Data: Since our dataset comprises time-stamped customer transactions, with each entry structured as CustomerID, InvoiceNumber, TransactionDate, and Amount, in the first step, we focused on detecting outliers within the "Amount" field.
Based on the numeric nature of the "Amount" field, we took advantage of a combined approach using the K-means algorithm as a distance-based algorithm and Z-scores to detect the outliers. To compute the Z-score for an individual transaction's amount (let us call it x_i), Equation (1) is applied.

$$z_i = \frac{x_i - \mu}{\sigma} \tag{1}$$

where µ represents the mean (average) of all transaction amounts and σ is the standard deviation.
The Z-score, as depicted in Equation (1), is a statistical measure of how far a score lies from the mean of a set of scores, expressed in units of standard deviation, and is used for outlier detection [56]. The primary assumption underlying this rule is that the variable X follows a normal distribution, leading to the Z-score having a standard normal distribution. Several studies demonstrate that the Z-score, in combination with a distance metric, can effectively detect outliers [57]. To identify outliers using the Z-score method, a predetermined threshold is typically set. A common practice is to set the threshold to 3, but this choice presumes approximately normally distributed data and may vary depending on the context and the dataset. To relax the normality assumption of the Z-score, and since every customer's data are valuable from an analytics perspective, a combination of the K-means algorithm with the Z-score method is employed to achieve precise and tailored outlier detection. We employed the Elbow or Within-cluster Sum of Squares (WSS) method to determine the ideal number of clusters (k) for K-means clustering [58]. The result indicated that the optimal value for k is at least 3. Consequently, we further explored clusters with k values ranging from 3 to 9. For a detailed breakdown of the number of samples in each cluster, please refer to Table 2.

Table 2. Number of samples in each cluster (columns: Number of Clusters; Number of Samples in Each Cluster). Table body not recovered.
Note: The identified outliers are presented in bold font and underscored.
Alternatively, an assessment of the outliers while varying the threshold values in the Z-score method was conducted. After analyzing the results, both the K-means algorithm and the Z-score method revealed a consensus on identifying outliers when k was set to 8 and the Z-score threshold was set to 6. This agreement led to the removal of 43 outliers from the dataset. It is worth noting that when the threshold was set within the range of 3 to 5, a total of 47 outliers were detected at most. This approach can serve as a practical solution for determining the threshold of the Z-score method based on the unique characteristics of the dataset.
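The threshold sweep described above can be sketched as follows. This is a minimal illustration that covers only the Z-score side of the combined approach; the function names, the toy amounts, and the use of the population standard deviation are assumptions made for the example, and the K-means side is omitted because the study does not state how cluster membership is mapped to outlier status.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Z-score of each value, following Equation (1)."""
    mu = mean(values)
    sigma = pstdev(values)  # assumption: population standard deviation
    if sigma == 0:
        return [0.0 for _ in values]
    return [(x - mu) / sigma for x in values]


def count_outliers(values, thresholds):
    """Number of values whose |z| exceeds each candidate threshold."""
    zs = z_scores(values)
    return {t: sum(1 for z in zs if abs(z) > t) for t in thresholds}


# Toy data (hypothetical): 99 ordinary amounts and one extreme amount
amounts = [100.0] * 99 + [10_000.0]
print(count_outliers(amounts, thresholds=(3, 4, 5, 6)))
```

Comparing the counts at each threshold with the sizes of the very small clusters produced by K-means is one way to look for the kind of agreement reported above.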
Feature Extraction: The popular RFM model was employed to represent customer behavior [5,7] in time intervals. The RFM model was introduced by Arthur Hughes in the 1990s to calculate the value of customer behavior. RFM encompasses recency, which offers insights into the time that has passed since the customer's last purchase; frequency, which quantifies the total number of purchases made and reflects customer loyalty within a specified timeframe; and monetary, which pertains to the average amount expended by the customer within a time interval [59]. Subsequent studies extended the understanding of customer characteristics by introducing the length of customer engagement (e.g., [16,60-64]) as an important feature in capturing customer behavior. In this study, an additional variable of the RFM model labeled L is utilized, which signifies the lifespan of each customer in the business. Therefore, the RFML dynamic features are defined for each customer as follows (refer to Appendix A for the details of the algorithm):

• Recency (R) represents the number of days which have passed since the last purchase prior to the current time interval;
• Frequency (F) pertains to the total number of purchases within the current time interval;
• Monetary (M) signifies the total purchasing amount during the current time interval;
• Lifespan (L) reflects the number of days which pass between the initial purchase and the last one in the current time interval.
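The definitions above can be illustrated with a small sketch for a single customer and a single time interval. It is not the algorithm of Appendix A: the function name `rfml_for_interval`, the input layout, and in particular the choice to measure recency at the end of the interval from the most recent purchase made on or before that date are assumptions made for illustration.

```python
from datetime import date


def rfml_for_interval(purchases, start, end):
    """RFML features of ONE customer for the interval [start, end].

    purchases: list of (purchase_date, amount) tuples for that customer.
    Returns None if the customer has no purchase on or before `end`.
    """
    history = sorted(p for p in purchases if p[0] <= end)
    if not history:
        return None
    in_interval = [p for p in history if p[0] >= start]
    first_day = history[0][0]
    last_day = history[-1][0]
    return {
        # Assumption: recency is measured at the end of the interval
        "R": (end - last_day).days,
        "F": len(in_interval),
        "M": sum(amount for _, amount in in_interval),
        # Days between the first purchase and the latest one seen so far
        "L": (last_day - first_day).days,
    }


# Hypothetical transactions for one customer
purchases = [
    (date(2023, 1, 5), 50.0),
    (date(2023, 2, 10), 20.0),
    (date(2023, 2, 20), 30.0),
]
print(rfml_for_interval(purchases, date(2023, 2, 1), date(2023, 2, 28)))
```

Calling the function once per customer and per interval yields the sequence of feature vectors that forms the time series of customer behavior.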
This transformation results in the conversion of raw customer transaction data into RFML feature vectors that are calculated on time intervals. In determining the length of the time interval, we consider various factors, including the nature of the data, specific domain requirements, and the objectives of our analysis. Our goal is to capture significant patterns within each segment without overly fragmenting the data. After careful consideration of the specified criteria, insights from the existing literature, and consultations with domain experts, a one-month time interval was selected for this study. Consequently, we extracted 37,442 samples that encapsulate the RFML dynamic features of customer behavior from the raw transactional data. In our subsequent data refinement process, we subjected each RFML feature to the same outlier detection method that was used on the raw data. As a result, we identified and removed a total of 764 customer behavior samples that exhibited outlier characteristics. Finally, min-max normalization was applied to scale the features into the [0,1] range [63].
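As a brief illustration of the scaling step, min-max normalization maps each value x of a feature to (x - min) / (max - min). The helper below is a minimal sketch; its name and the handling of a constant column (mapped to 0.0) are assumptions made for the example.

```python
def min_max_scale(column):
    """Scale one feature column into the [0, 1] range."""
    lo, hi = min(column), max(column)
    if hi == lo:
        # Assumption: a constant feature carries no information, so map it to 0.0
        return [0.0 for _ in column]
    return [(x - lo) / (hi - lo) for x in column]


# Hypothetical frequency values for four customer-month samples
print(min_max_scale([10, 20, 30, 20]))
```

Applying the helper to each of the R, F, M, and L columns separately keeps features measured in days, counts, and currency on a comparable scale for distance-based clustering.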

Customer Behavior Segmentation
Clustering is a complex task, where the quality of outcomes relies significantly on two critical decisions: the selection of an appropriate clustering algorithm and the choice of a suitable distance measure [64]. Considering the previously discussed benefits and drawbacks of various clustering methods within the context of time series data and the time complexity associated with each method, this study aims to assess the practical utility of the commonly employed partitioning clustering technique, K-means, in the analysis of customer behavior dynamics. Given that the K-means algorithm relies on the optimization criterion involving the distances between data points, our focus is on the examination of various distance metrics to determine the most effective one in this context. In particular, our goal is to employ a variety of distance metrics, encompassing both commonly used metrics in clustering problems and those specifically designed for time series data. Furthermore, the research seeks to incorporate a multi-step clustering approach, guided by the dual objectives outlined in the following section, to gain a deeper understanding of the dynamics of customer behavior.
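One way to examine several distance metrics with the same partitioning procedure is to pass the distance as a parameter. The sketch below is an illustrative assumption rather than the implementation used in this study: initial centroids are supplied by the caller, and centroids are always updated as the arithmetic mean of their members, which is the exact minimizer only for the squared Euclidean criterion and acts as a heuristic for the other metrics.

```python
def euclidean(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5


def manhattan(p, q):
    return sum(abs(a - b) for a, b in zip(p, q))


def chebyshev(p, q):
    return max(abs(a - b) for a, b in zip(p, q))


def kmeans(points, init_centroids, distance, iterations=100):
    """Lloyd-style K-means with a pluggable distance function."""
    centroids = [list(c) for c in init_centroids]
    k = len(centroids)
    labels = None
    for _ in range(iterations):
        # Assignment step: nearest centroid under the chosen distance
        new_labels = [
            min(range(k), key=lambda c: distance(p, centroids[c])) for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centroids are too
        labels = new_labels
        # Update step (assumption): mean of the members, whatever the metric
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical normalized feature vectors forming two obvious groups
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (1.0, 1.0), (0.9, 1.0), (1.0, 0.9)]
for metric in (euclidean, manhattan, chebyshev):
    labels, _ = kmeans(points, [points[0], points[3]], metric)
    print(metric.__name__, labels)
```

Supplying the initial centroids explicitly keeps the example deterministic; in practice a random or k-means++ style initialization with several restarts would normally be used.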

Customer Behavior Status Prediction
In this study, a classification algorithm is employed to predict the status of new customer behaviors as represented by the RFML feature vectors. This approach encompasses several distinct objectives. Firstly, the segment or status membership of new customer behaviors is determined using classification algorithms. A multi-step clustering methodology, encompassing both clustering and classification algorithms, is employed to enhance the overall performance of cluster prediction. Secondly, a feature importance analysis is conducted to identify the key features that make significant contributions to each segment, thereby providing valuable insights to decision-makers. Lastly, a tool inspired by counterfactual analysis is devised, empowering decision-makers to evaluate the consequences of altering individual features on the status of individual customers within a segment, as opposed to considering the entire cluster as a whole. To achieve this, a comprehensive evaluation of three classification algorithms (Random Forest, logistic regression, and Decision Tree) is undertaken. The purpose of this examination is to identify the most appropriate algorithm aligned with our research objectives. In our selection process, three fundamental criteria that the chosen algorithm must meet are established, guided by the aforementioned goals. Firstly, non-distance-based algorithms are employed to complement the distance-based clustering approach applied in the preceding step. Secondly, the algorithm is expected to yield comprehensive insights into feature importance. Lastly, the ability to provide probabilities associated with each class membership is another crucial criterion under consideration.

Counterfactual Analysis and Personalized Strategies
In this section, we aim to take advantage of the concept of counterfactuals. This term, with its roots in the works of philosophers David Hume and John Stuart Mill, has acquired computer-friendly semantics in recent decades. A common query within the counterfactual realm necessitates retrospective reasoning, often posed as, "What if I had acted differently?" In fact, counterfactuals are the building blocks of scientific thinking as well as legal and moral reasoning [64]. This section describes how counterfactuals are used in our study.
In the domain of counterfactual analysis, we come across expressions represented as P(y_x | x′, y′). These expressions symbolize the probability of event Y taking a specific value y had X been x, given that we have actually observed X as x′ and Y as y′. To illustrate this with an example, consider the probability that a customer's segment would be y = "Loyal Patron" if the number of his monthly purchases (x) grew 10 percent for 3 consecutive months, given that his actual segment is y′ = "At Risk of Losing" and the number of his monthly purchases is x′. This framework allows us to explore hypothetical scenarios and evaluate the likelihood of outcomes when certain variables or conditions are altered, based on the observed data and relationships between variables.
These statements, capturing counterfactual probabilities, are computable when we have access to functional or structural equation models or related properties of such models. In other words, as expounded by Pearl [65], these models provide the necessary framework for quantifying and reasoning about such counterfactual scenarios. They enable us to explore what might have occurred under different circumstances, given the observed outcomes and the underlying structural relationships between variables. In this study, the aim is to harness the potential of counterfactual analysis and provide the basis of a practical tool that empowers businesses to assess a multitude of scenarios. This approach enables the identification of highly personalized marketing proposals for individual customers, unlike traditional generalized strategies employed for each customer group. To do so, we use the class membership probabilities provided by the algorithms chosen in the prior phase and extract the influence of each feature, thereby contributing to a comprehensive understanding of customer behavior determinants. The objective of this approach is to enhance decision-making processes within the domain of customer-centric marketing strategies. Figure 1 provides an overview of the proposed methodology designed to address the objectives of this study and will be applied to real-world data in the subsequent section.
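A simplified "what-if" evaluation in the spirit of the approach described above can be sketched with a classifier that returns class membership probabilities. Everything specific in the example is hypothetical: the two segment labels, the softmax model, its coefficients, and the customer's feature values are invented for illustration, and re-scoring a modified feature vector is only an approximation of a counterfactual query, not a full structural-model computation.

```python
import math

# Hypothetical softmax (multinomial logistic) model over normalized [R, F, M, L].
# Each segment maps to (bias, weights); the numbers are invented for illustration.
WEIGHTS = {
    "at_risk": (0.5, [2.0, -3.0, -1.0, -0.5]),
    "loyal": (-0.5, [-2.0, 3.0, 1.0, 0.5]),
}


def predict_proba(x):
    """Class membership probabilities for one RFML vector."""
    scores = {
        seg: bias + sum(w * v for w, v in zip(ws, x))
        for seg, (bias, ws) in WEIGHTS.items()
    }
    top = max(scores.values())  # subtract the maximum for numerical stability
    exps = {seg: math.exp(s - top) for seg, s in scores.items()}
    total = sum(exps.values())
    return {seg: e / total for seg, e in exps.items()}


def what_if(x, feature_index, factor):
    """Probabilities before and after scaling one feature of one customer."""
    changed = list(x)
    # Features are min-max normalized, so cap the modified value at 1.0
    changed[feature_index] = min(1.0, changed[feature_index] * factor)
    return predict_proba(x), predict_proba(changed)


# Hypothetical customer: what if monthly purchase frequency (index 1) grew by 10%?
before, after = what_if([0.6, 0.3, 0.2, 0.4], feature_index=1, factor=1.10)
print(before)
print(after)
```

Running the same comparison for different features and factors gives a per-customer view of which change would most increase the probability of belonging to a desired segment.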

Results
This section presents the outcomes of applying the methodology introduced in the previous section to real customer transaction data, illustrating the practical implementation of our approach. Following the preprocessing phase as detailed in the preceding section, we proceed with a step-by-step process involving customer behavior segmentation, customer behavior status prediction, and the application of counterfactual analysis. This analysis evaluates the impact of potential marketing strategies on customer status, helping to determine tailored strategies for each customer.

Customer Purchasing Behavior Segmentation
In this step, the K-means algorithm is applied to the time series RFML feature vectors using various distance metrics. Since the K-means implementation in the scikit-learn library relies on the Euclidean distance metric, we implemented our own version of the K-means algorithm to assess its performance with various distance metrics, using Python version 3.11.5 within the Anaconda software version 23.7.4. Subsequently, the results were analyzed using the silhouette index [66-68] to ascertain which distance metric yields denser and more distinctly separated clusters of customer behaviors, represented by the time series RFML feature vectors. Since the distance measure has a direct impact on the clustering quality of time series data [69], we applied the Euclidean, Manhattan, and Chebyshev distance metrics, which are commonly used in clustering [39], as well as Dynamic Time Warping (DTW) [70], the temporal correlation coefficient (CORT) [71,72], and the complexity-invariant distance (CID) [73], which are designed for time series data. The results are shown in Table 3. They demonstrate that both Euclidean and Manhattan distances exhibit superior performance in partitioning customers' behavior data into five distinct behavioral groups. This suggests that these distance metrics are particularly effective in distinguishing the dynamics of customers' behaviors represented by RFML feature vectors, with the Euclidean distance showing slightly better performance. The optimal number of clusters was fully consistent with the value obtained from the WSS method for K-means. Table 4 presents the cluster statistics, encompassing cluster size and cluster compactness, calculated as the Within-cluster Sum of Squares.
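For reference, the silhouette index used to compare the distance metrics can be sketched as follows. For each sample, a is the mean distance to the other members of its own cluster, b is the smallest mean distance to the members of any other cluster, and the sample score is (b - a) / max(a, b). The function names and the toy data are assumptions made for the example, and singleton clusters are given a score of 0 by convention.

```python
def euclidean(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5


def silhouette_index(points, labels, distance=euclidean):
    """Mean silhouette score over all samples for a given labeling."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    if len(clusters) < 2:
        raise ValueError("the silhouette index needs at least two clusters")
    scores = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)  # convention for singleton clusters
            continue
        # distance(p, p) is 0, so dividing by len(own) - 1 excludes p itself
        a = sum(distance(p, q) for q in own) / (len(own) - 1)
        b = min(
            sum(distance(p, q) for q in other) / len(other)
            for other_lab, other in clusters.items()
            if other_lab != lab
        )
        scores.append(0.0 if max(a, b) == 0 else (b - a) / max(a, b))
    return sum(scores) / len(scores)


# Hypothetical points: a good labeling scores high, a poor one scores low
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
print(silhouette_index(points, [0, 0, 1, 1]))
print(silhouette_index(points, [0, 1, 0, 1]))
```

Values close to 1 indicate dense, well-separated clusters, while values near or below 0 indicate overlapping or misassigned samples.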
Figure 2 presents both pairwise feature analysis and a 3D visualization of the clusters. The pairwise analysis demonstrates how the segments manifest from the viewpoint of each feature pair. Notably, better segment separation is observed when examining data through the lens of lifespan, which can show the importance of this feature in CPB clustering. By integrating all plots from the pairwise analysis, we gain insights into the overall data and cluster structure, aligning with the presentation in the 3D plot in Figure 2. Following this phase, it is crucial to meticulously analyze the outcomes of each cluster based on business rules and objectives. Figure 3 illustrates the distribution of each feature within the clusters, allowing for a comparison with the overall mean across all clusters. This visualization provides valuable insights into the positioning of features within clusters and aids in extracting shared customer behaviors within each group.
Cluster 0 belongs to dormant customers. Customers in this segment exhibit very high recency and very low frequency and monetary value, with a slightly above-average lifespan. This indicates that these customers have not made recent purchases, purchase very infrequently, and spend very little when they do. They may have been active in the past but churned for a while and are disengaged. Infrequent low-spenders formed cluster 1. This cluster is characterized by low recency, frequency, and monetary value and a short lifespan. These are relatively new customers who have made recent purchases but buy infrequently and spend little. Their engagement is minimal and spread over a short period. Long-term loyal customers are in cluster 2. They show moderate recency and relatively high frequency and monetary value, with a very long lifespan. They are consistent purchasers who engage regularly and spend a good amount over an extended period. Their moderate recency suggests they continue to engage with the business, making them valuable in the long term.
Cluster 3 includes high-value shoppers. This segment stands out with very low recency, very high frequency, and the highest monetary value, coupled with a long lifespan. These customers make recent, frequent purchases and spend significantly, representing the most valuable customers for the business. The last cluster, cluster 4, belongs to moderate-value shoppers. Customers in this cluster have moderate recency and relatively moderate frequency and monetary value compared to other clusters, with an above-average lifespan. These customers have been with the business for a considerable time but are not the most frequent or high-spending buyers. They represent a segment of moderately valuable customers who could be encouraged to increase their engagement and spending through targeted promotions and personalized marketing strategies. These results enhance our understanding of common purchasing behaviors within each group and provide valuable insights for studying and evaluating customer transitions between segments.
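The comparison of each cluster's features with the overall mean can be sketched as follows. This is an illustrative assumption about how such a profile might be computed, not the procedure behind Figure 3; the function name `cluster_profiles`, the single four-value summary per customer, and the toy numbers are all hypothetical.

```python
from statistics import mean

FEATURES = ("recency", "frequency", "monetary", "lifespan")

def cluster_profiles(vectors, labels):
    """Mark each cluster's mean feature as above, below, or at the overall mean."""
    overall = [mean(col) for col in zip(*vectors)]
    profiles = {}
    for c in sorted(set(labels)):
        members = [v for v, lab in zip(vectors, labels) if lab == c]
        cluster_means = [mean(col) for col in zip(*members)]
        profiles[c] = {
            f: "above" if m > o else "below" if m < o else "at"
            for f, m, o in zip(FEATURES, cluster_means, overall)
        }
    return profiles

# Hypothetical toy data: two dormant-like and two high-value-like customers.
vectors = [[300, 1, 20, 14], [280, 2, 30, 15], [10, 12, 900, 30], [15, 10, 800, 28]]
labels = [0, 0, 1, 1]
profiles = cluster_profiles(vectors, labels)
for c, profile in profiles.items():
    print(c, profile)
```

Reading the profile row by row gives the kind of verbal description used above, for example high recency with low frequency and monetary value for a dormant group.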

Customer Purchasing Status Prediction
The Random Forest, logistic regression, and Decision Tree algorithms are utilized to predict customer behavior status. The goal is to train these algorithms using cluster labels extracted from the K-means algorithm. To identify optimal parameters for each algorithm, 5-fold cross-validation is employed. The performance of each algorithm in assigning a status to each customer behavior is then assessed and compared using accuracy, F1 score, and Cohen's kappa, which are commonly used validation metrics for multi-class classification models [74]. Additionally, the symmetric mean absolute percentage error (sMAPE), the main metric used to assess forecasting accuracy in time series competitions [75], is calculated using Equation (2):

\mathrm{sMAPE} = \frac{100}{n} \sum_{t=1}^{n} \frac{|y_t - \hat{y}_t|}{(|y_t| + |\hat{y}_t|)/2} \quad (2)

where $y_t$ and $\hat{y}_t$ are the actual and predicted $t$th values, respectively, and $n$ is the number of predictions. The results are presented in Table 5. They indicate that both logistic regression and Decision Tree algorithms exhibit strong performance in predicting customer behavior. However, logistic regression consistently outperforms the other algorithms across all metrics, demonstrating superior predictive capabilities. Following that, the impacts of each feature on the prediction are derived from the trained logistic regression model, providing crucial inputs for the subsequent counterfactual analysis.
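As a worked illustration of the evaluation metrics, the sketch below computes accuracy, Cohen's kappa, and sMAPE on a small hypothetical label sequence. It assumes the common sMAPE form whose denominator is the mean of the absolute actual and predicted values, and it assumes that a term with a zero denominator (both values equal to 0, which can occur with a cluster label of 0) contributes nothing; neither convention is confirmed by the text.

```python
def accuracy(actual, predicted):
    return sum(a == p for a, p in zip(actual, predicted)) / len(actual)

def cohen_kappa(actual, predicted):
    """Agreement between two labelings, corrected for chance agreement."""
    n = len(actual)
    observed = accuracy(actual, predicted)
    classes = set(actual) | set(predicted)
    expected = sum((actual.count(c) / n) * (predicted.count(c) / n) for c in classes)
    return (observed - expected) / (1 - expected)

def smape(actual, predicted):
    """Symmetric MAPE in percent; 0/0 terms are skipped (assumed convention)."""
    total = 0.0
    for y, y_hat in zip(actual, predicted):
        denom = (abs(y) + abs(y_hat)) / 2
        if denom:
            total += abs(y - y_hat) / denom
    return 100 * total / len(actual)

# Hypothetical cluster labels for eight customer behaviors (one misclassified).
y_true = [0, 1, 2, 3, 4, 3, 2, 1]
y_pred = [0, 1, 2, 3, 4, 3, 2, 2]

print("accuracy:", accuracy(y_true, y_pred))
print("kappa:   ", round(cohen_kappa(y_true, y_pred), 4))
print("sMAPE:   ", round(smape(y_true, y_pred), 4))
```

Note that sMAPE treats cluster labels as numbers, so its value depends on how the clusters happen to be numbered, whereas accuracy and Cohen's kappa do not.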

Counterfactual Analysis and Personalized Strategies
In this phase, the primary objective is to assess the influence of each feature on customers transitioning between different segments or statuses. By identifying key features driving CPB, businesses can tailor marketing strategies to individual customers and potentially guide them towards desired outcomes if necessary. This approach not only has the potential to increase the likelihood of success in marketing endeavors but also holds the promise of increasing revenue for companies. Therefore, our ultimate goal is to provide companies with a reliable tool to evaluate the impact of each offer on the customer's purchasing behavior before its implementation. The information derived from predicting the customer's purchase status, which shows the importance of each feature in the forecasted scenario, lays the essential groundwork for this part. The equation derived from the logistic regression model, as depicted in Equation (3), enables the assessment and quantification of the probability of a future behavior aligning with each segment.
P(y = c \mid x; \omega) = \frac{\exp(\beta_c^{T} x)}{\sum_{k=1}^{C} \exp(\beta_k^{T} x)} \quad (3)

Suppose we have a multi-class problem with C classes (C ≥ 2). Equation (3) provides the probability that a new observation, denoted as x, belongs to class c, with $\omega = [\beta_1^{T}; \beta_2^{T}; \ldots; \beta_C^{T}]$, $\omega \in \mathbb{R}^{Cd}$, being the collection of the parameter vectors of the C linear models. The parameter vectors of the models are employed to assess the influence of each feature on the customer status and CPB transitions, as illustrated in Figure 4. This equation serves as a basis for developing a framework to forecast the effects of different potential strategies on customer purchasing behavior. It provides valuable insights into guiding customers to transition away from a suboptimal cluster, that is, one that does not align with the business's objectives. Adjusting behavior features through marketing offers can facilitate this transition toward a desired cluster. By formulating the effect of potential marketing scenarios on the R, F, M, and L parameters for each customer, businesses can evaluate the impact of these scenarios on customer purchasing behavior before their implementation and predict customer status. To demonstrate the applicability of our framework, let us delve into the transition from cluster 0, representing dormant customers, to cluster 3, comprising high-value shoppers. As depicted in Figure 4, the frequency of purchases exerts the most significant influence on this transition. Following closely is the reduction in the time intervals between purchases. To facilitate such transitions effectively, the company should implement strategies that encourage customers to make more frequent purchases at shorter time intervals. In this scenario, emphasizing a higher transactional monetary value has a comparatively weaker impact on the transition. Table 6 demonstrates the results of implementing some potential scenarios on customer behavior #1944 in the dataset under study. The current behavior of this customer falls under cluster 0.
Analyzing the feature distribution of this cluster, as depicted in Figure 3, revealed that this cluster comprises customers who had churned and had not made a purchase for a while (approximately a year) but have recently re-engaged with a new purchase. This customer behavior is analyzed to determine the potential scenarios that would cause the customer to transition to a more engaged cluster, such as cluster 2, 3, or 4.
To initiate the transition process, employing insights from Figure 4, hypothetical scenarios should be devised for this customer that aim at guiding them into the desired clusters. Subsequently, for each scenario, the changes induced by the scenario are translated into their R, F, M, and L features. In the following step, the question "What if the customer behaves like this?" is addressed to assess the effect of the scenario on the subsequent state of the customer. These hypothetical scenarios can be evaluated using the trained logistic regression model to forecast the next status of the customer if the strategy is implemented. Table 6 presents three scenarios with different behaviors and their impact on the customer's transition. In the first scenario, the customer makes a purchase within the next three months, reducing their recency while keeping their current monetary value unchanged. Figure 4 indicates that a reduction in recency has the most significant effect on the customer's transition from cluster 0 to cluster 4. This decrease in the recency of CPB #1944 would transition it from a dormant customer to a moderately engaged one. In the second scenario, appropriate marketing offers would need to increase the purchase frequency by a factor of four, with the customer making purchases within the next two months. This requires a significant change in customer behavior, and marketing experts can devise a step-by-step transition through this segment and turn the customer into a high-value shopper belonging to cluster 3.
In the third scenario, the customer maintains the current frequency and monetary value but makes purchases every two months over the next two years. This consistent engagement reduces recency, and after two years, with the increased lifespan, CPB #1944 transitions to cluster 2 to become a long-term loyal customer. The result facilitates the decision-making process for determining effective marketing propositions to offer to the customer. The aim is to expedite the transition of customers to a desired cluster or state, ensuring optimal outcomes in terms of speed and efficacy.
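The "what if" evaluation described above can be sketched in a few lines, assuming the softmax form of multinomial logistic regression without intercepts. Everything numeric here is hypothetical: the three illustrative classes, the coefficient vectors in `BETAS`, the standardized feature values of the example customer, and the helper names `class_probabilities` and `what_if`. None of these values comes from the trained model or from Table 6.

```python
import math

FEATURES = ("R", "F", "M", "L")

# Hypothetical coefficient vectors (one per class) on standardized features.
BETAS = {
    "dormant":    [2.0, -1.0, -0.5, 0.0],
    "moderate":   [-0.5, 0.2, 0.1, 0.3],
    "high_value": [-1.5, 1.5, 0.8, 0.2],
}

def class_probabilities(x, betas):
    """Softmax over the linear scores beta_c . x, computed stably."""
    scores = {c: sum(w * v for w, v in zip(beta, x)) for c, beta in betas.items()}
    top = max(scores.values())
    exps = {c: math.exp(s - top) for c, s in scores.items()}
    total = sum(exps.values())
    return {c: e / total for c, e in exps.items()}

def what_if(customer, scenario, betas):
    """Apply the feature values set by a scenario and predict the new status."""
    changed = dict(customer, **scenario)
    x = [changed[f] for f in FEATURES]
    probs = class_probabilities(x, betas)
    return max(probs, key=probs.get), probs

# Hypothetical dormant-like customer: high recency, low frequency and spend.
customer = {"R": 1.5, "F": -1.0, "M": -0.5, "L": 0.2}

scenarios = {
    "no change": {},
    "recent purchase (R down)": {"R": -1.0},
    "recent and frequent purchases (R down, F up)": {"R": -1.0, "F": 1.5},
}

outcomes = {}
for name, scenario in scenarios.items():
    status, probs = what_if(customer, scenario, BETAS)
    outcomes[name] = status
    print(f"{name}: {status} ({probs[status]:.2f})")
```

Each scenario only overrides the features it is expected to change, so a marketing offer is expressed as a mapping from features to their anticipated new values before the model is queried.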
To further explore the application of this methodology, Table 7 examines the impact of different scenarios on CPB #3309, currently categorized under cluster 3 as a high-value shopper. Although this is a valuable customer, as revealed in the results presented in Figure 3, we aim to investigate scenarios that could keep the customer in the current status or potentially lead to a transition into an undesired status of clusters 0, 2, and 4 for this customer. This information can assist decision-makers in developing proactive strategies to prevent the customer from moving out of a profitable cluster.
Table 7. Scenarios examined for CPB #3309: inactive for three months; inactive for seven months (R↑); decreasing F by a factor of two (F↓); reducing F by a factor of two after two years (F↓L↑). ↓ (down arrow) indicates a decrease or reduction in the value of the corresponding feature. ↑ (up arrow) indicates an increase or augmentation in the value of the corresponding feature.
The first scenario indicates that maintaining the current metrics of recency, frequency, and monetary value for CPB #3309 over the next two years does not change the customer's current status. The second and third scenarios examine the impact of purchase recency alterations on customer status. The results show that if this customer becomes inactive for three months, the status remains unchanged. However, seven months of inactivity causes the customer to shift to cluster 0, which belongs to dormant customers. The last two scenarios indicate that if the purchase frequency decreases by half in the short term, the customer will move to cluster 4, which includes moderate-value shoppers. Conversely, if this reduction happens gradually over two years, the customer shifts to cluster 2 and remains loyal, maintaining good frequency and monetary value. Nevertheless, they will no longer be categorized as a high-value customer in cluster 3. These assessments enable businesses to identify which changes in customer purchasing behavior will affect a customer's current status within a desired cluster. For instance, while a period of inactivity may be manageable for the business if it spans three months for this customer, extending it well beyond three months (seven months in the scenario examined) signifies a potential churn risk and leads the customer to cluster 0.
In such cases where transitions are suboptimal for a business, strategic marketing initiatives can be devised to steer the customer back toward favorable engagement. Figure 5 provides further insights into the effort required for transitioning between different segments. It is calculated based on coefficients assigned to each cluster obtained from the prediction model.
The analysis of Figure 5 reveals that transitioning from cluster 1 (infrequent low-spenders) to cluster 2 (long-term loyal customers) poses the most significant challenge. Conversely, it suggests that moving between clusters 0 (dormant customers) and 4 (moderate-value shoppers) appears notably more manageable.
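One way to see why per-cluster coefficients carry information about transitions, assuming the softmax form of multinomial logistic regression, is to look at the log-odds between a target cluster b and a source cluster a. This is a general property of the model and is offered only as an interpretation aid; the exact computation behind the effort measure in Figure 5 is not specified in the text.

```latex
\log \frac{P(y = b \mid x)}{P(y = a \mid x)} = (\beta_b - \beta_a)^{T} x,
\qquad
\Delta \log \frac{P(y = b \mid x)}{P(y = a \mid x)} = (\beta_b - \beta_a)^{T} \Delta x
```

Under this reading, a feature j with a large absolute difference between its coefficients in $\beta_b$ and $\beta_a$ shifts the odds of moving from a to b strongly per unit of change, while small differences mean that large behavioral changes are needed.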

Discussion and Managerial Implications
Traditional "static segmentation" methods capture customer behavior at a single point in time and treat extended periods as a single snapshot. This approach overlooks the dynamic changes in customer behavior over time, limiting its ability to predict future purchasing behaviors. Our dynamic segmentation approach, on the other hand, captures customer behavior in shorter, more frequent time intervals. This method allows for the tracking of evolving patterns and training predictive models to forecast customer purchasing behavior over time. Consequently, it provides a more dynamic and comprehensive analysis of customer interactions.
By shifting the focus from segment-level to individual customer behavior analysis and leveraging time series segmentation and counterfactual analysis, we proposed a framework that provides a basis for more precise and personalized marketing efforts. To capture the dynamics of customer purchasing behavior, we first represented customer behavior using time series feature vectors and then conducted dynamic segmentation. In the second part, a logistic regression model was trained to forecast future customer purchasing behavior. Utilizing the capabilities of the trained model, we extracted the impact of each behavioral feature on customer transition between segments, which provides insights for devising personalized marketing strategies based on the customer's current behavior. In the final part, by leveraging counterfactual analysis and the trained predictive model, we could evaluate the effect of various marketing strategies on the future behavior of the customer, which empowers businesses to implement the best strategy for each customer separately. This approach enables businesses to predict how changes in marketing strategies impact individual customer behaviors before implementation. This level of personalization allows for more effective marketing campaigns, as strategies can be tailored to the specific behavioral characteristics of each customer, thereby fostering stronger customer relationships and enhancing loyalty. The implications of this approach are manifold, ranging from improved customer retention to increased revenue through tailored marketing interventions. Moreover, this approach was designed to handle large datasets efficiently, making it suitable for real-world applications.
One of the novel contributions of our research is the focus on customer transitions between segments over time. By analyzing how customers transition between segments and identifying the features that drive these transitions, businesses can develop strategies to either retain customers in desirable segments or guide them toward more valuable segments. This dynamic approach ensures that marketing efforts are continually adapted to the evolving behaviors and preferences of customers, thus maintaining relevance and effectiveness.
The experimental results presented in our study provide valuable insights into the effectiveness of the clustering and prediction algorithms for customer purchasing behaviors. The superior performance of the Euclidean distance metric in clustering and the logistic regression algorithm in predicting customer behavior status underscores the importance of choosing appropriate methodologies for different aspects of customer behavior analysis.
This case study was conducted on data from an information technology company that provides various packages of services, software, and hardware products. The transaction data utilized in this study allow for deriving the customer purchasing behavior characteristics required for applying the extended version of the RFM model. These types of data are commonly stored by a diverse range of businesses, including retail stores, B2B wholesalers, banks, and telecom companies. These businesses regularly capture and maintain similar transaction data, making our approach broadly applicable across various e-commerce contexts. Consequently, our methodology has the potential for wide implementation and can provide valuable insights and personalized marketing strategies for companies operating in different sectors.
From a managerial perspective, the adoption of our proposed framework requires a shift in mindset from segment-based marketing to a more individualized approach. Managers should invest in advanced data analytics tools and develop the necessary skills within their teams to analyze and interpret complex customer data. Additionally, a continuous feedback loop should be established to monitor the effectiveness of implemented strategies and make necessary adjustments based on real-time data insights. This proactive and data-driven approach to customer management can significantly enhance the strategic decision-making process, leading to more successful marketing outcomes.

Conclusions
In the context of business and industry, the ability to forecast the impact of marketing offers on customer behavior before their implementation is crucial for guiding strategic decisions, optimizing resource allocation, and ensuring the success of marketing campaigns. In this study, we conducted time series customer segmentation, developed a predictive model to forecast customer status, and analyzed the dynamics of customer behavior at an individual level, moving beyond the segment-level approach focused on in previous studies. We addressed customer transitions between segments, an aspect previously overlooked. To achieve this objective, we employed counterfactual analysis. This approach provided the analytical basis to analyze customer transitions, examine the outcomes of different strategies on customer behavior before their implementation, and design marketing actions at an individual level.
Our study began by evaluating the practical applicability of the K-means clustering technique, employing various distance metrics, encompassing both commonly used metrics and those specifically designed for time series data. Customer behavior was represented by time series feature vectors including recency, frequency, monetary value, and lifespan extracted at one-month intervals. The results showed the potential of Euclidean and Manhattan distance metrics in effectively separating customer data into distinct behavior groups, with the Euclidean distance exhibiting slightly better performance. This highlights their capacity to distinguish the patterns in time series customer behavior compared to other distance metrics utilized in this study. Subsequently, we investigated the performance of the Random Forest, logistic regression, and Decision Tree algorithms in predicting customer behavior status. Our exploration into classification algorithms highlighted the proficiency of the logistic regression algorithm in predicting customer behavior status, achieving remarkable performance metrics with an accuracy of 0.9981, F1 score of 0.999, Cohen's kappa of 0.999, and sMAPE of 0.345. Consequently, our analysis of potential scenarios involving the alteration of customer behavior characteristics through marketing offers underscores the applicability and efficacy of our framework. Our findings demonstrate how businesses can strategically guide customers from suboptimal behavior to a targeted status, while also effectively maintaining customers in their optimal behavior. This insight is invaluable for businesses seeking to optimize their marketing strategies and establish long-term customer satisfaction and loyalty.

Limitations and Future Research Directions
While our study contributes valuable insights to the realm of customer purchasing behavior analytics, it is essential to acknowledge some limitations inherent in our research design. Our analysis relies on data derived from a one-time experiment conducted with a specific company. Given that effective customer relationship and retention management is an ongoing endeavor that requires continuous segmentation and status prediction, our reliance on a one-time experiment raises methodological considerations.
Adapting to the evolving landscape of customer behavior ideally requires retraining the segmentation and predictive models on updated data. Several strategies can be implemented to maintain accuracy and relevance in forecasting customer behavior. An automated data pipeline can ensure that models are trained on the latest data, minimizing the lag between data collection and model updates. For instance, if new transaction data are collected daily, they can be automatically integrated into the training process so that the models reflect the most current customer behavior. Alternatively, if immediate updates are not required or feasible, a regular retraining schedule, such as monthly or quarterly, ensures that models incorporate recent trends. Additionally, a feedback loop that compares predicted and actual behaviors can refine the models for better accuracy. Moreover, by setting predefined performance thresholds, models can be retrained automatically when their performance drops below these levels. A further complication arises once marketing actions have been applied, because the updated data will include customers who were previously targeted. Training the model on such data may inadvertently compromise prediction quality, as these customers might have exhibited undesirable behavior but were successfully adjusted. To address this issue, previously targeted customers can be excluded from the training set, or additional features can be incorporated to capture historical information about customer behavior statuses. Further research is required to address this issue.
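The feedback loop and threshold-triggered retraining described above could be sketched as follows. The class name `RetrainingMonitor`, the window size, and the accuracy threshold are hypothetical choices made for illustration; they are not part of the study, and a production system would also need the data pipeline and retraining job that this sketch leaves out.

```python
from collections import deque


class RetrainingMonitor:
    """Track rolling agreement between predicted and observed statuses.

    Hypothetical helper: flags retraining when rolling accuracy over a full
    window falls below a predefined threshold.
    """

    def __init__(self, window_size=100, min_accuracy=0.95):
        self.outcomes = deque(maxlen=window_size)  # True where prediction was correct
        self.min_accuracy = min_accuracy

    def record(self, predicted_status, actual_status):
        """Feed back one observed outcome for a previously predicted customer."""
        self.outcomes.append(predicted_status == actual_status)

    def rolling_accuracy(self):
        """Accuracy over the current window, or None if nothing was recorded."""
        if not self.outcomes:
            return None
        return sum(self.outcomes) / len(self.outcomes)

    def needs_retraining(self):
        """True only when the window is full and accuracy is below the threshold."""
        if len(self.outcomes) < self.outcomes.maxlen:
            return False  # avoid triggering on a handful of samples
        return self.rolling_accuracy() < self.min_accuracy


# Usage with a small window: four correct predictions, then two misses.
monitor = RetrainingMonitor(window_size=4, min_accuracy=0.75)
for _ in range(4):
    monitor.record(predicted_status=2, actual_status=2)
before = monitor.needs_retraining()
monitor.record(predicted_status=1, actual_status=2)
monitor.record(predicted_status=0, actual_status=2)
after = monitor.needs_retraining()
print(before, after, monitor.rolling_accuracy())
```

Outcomes for customers who were targeted by a marketing action could be skipped before calling `record`, which is one simple way to reflect the exclusion strategy mentioned above.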
To enhance the development of marketing strategies, a key avenue for improvement involves incorporating additional data types, specifically the demographic characteristics of customers and their interactions with the business. This descriptive information can provide more comprehensive insights for developing customized marketing and retention strategies tailored to customer preferences. However, because our current setup relies on data from a specific company, data availability constrained our ability to integrate such features. Future work is needed to study these challenges and opportunities to enhance marketing and retention capabilities. Additionally, longitudinal studies that track the effectiveness of marketing strategies over extended periods can provide valuable insights into the evolution of customer behavior.
Our next step involves addressing the limitations of this study and developing a software application grounded in the mathematical insights derived from logistic regression and counterfactual analysis. This tool will incorporate a continuous training model that periodically updates vector parameters. The aim is two-fold: to provide businesses with the required foundation for evaluating various marketing offers for each customer individually, and to offer a step-by-step approach that simplifies the process of transitioning customers to a targeted status from the business perspective.

Figure 1. Proposed methodology for dynamic customer behavior analysis and evaluating personalized marketing strategies.


Figure 3. Analyzing feature distributions across clusters compared to overall mean values: a visual examination in our case study.

Figure 4. Analyzing the impact of each feature on customer transition: insights from the logistic regression predictive model.

J. Theor. Appl. Electron. Commer. Res. 2024, 19
Scenario rows recovered from Tables 6 and 7 (text lost in extraction is marked with "…"; the columns to which the X marks belong are not recoverable): … within the next 3 months with the current value (R↓); increasing F by a factor of 4 and making purchases within the two months (R↓F↑); … purchases with current values of F and M every two months for two years (R↓L↑); … metrics for the next two years; …ctive for three months; …ctive for seven months (R↑); decreasing F by a factor of two (F↓); reducing F by a factor of two after two years (F↓L↑).

Figure 5. Effort analysis for customer purchasing behavior transitions.

Table 1. Recent studies on dynamic customer segmentation utilizing time series features.

Table 3. K-means clustering with a varied number of clusters and distance measures: silhouette results.

Table 4. Analyzing cluster size and compactness in K-means using Euclidean distance.

Table 5. Performance in predicting customer behavior status: evaluation metrics comparison.

Table 6. Potential scenarios for transitioning customer behavior #1944 to a desired cluster.

↓ (Down arrow) indicates a decrease or reduction in the value of the corresponding feature. ↑ (Up arrow) indicates an increase or augmentation in the value of the corresponding feature.

Table 7. Potential scenarios for maintaining customer behavior #3309 in the desired cluster and their outcomes on customer status.
