Utility in Time Description in Priority Best–Worst Discrete Choice Models: An Empirical Evaluation Using Flynn’s Data

Abstract: Discrete choice models (DCMs) are applied in many fields, including the statistical modelling of consumer behavior. This paper focuses on one form of choice experiment, best–worst scaling in discrete choice experiments (DCEs), and on the transition probability of a consumer's choice over time. The analysis was conducted using simulated data (choice pairs) based on data from Flynn's (2007) 'Quality of Life Experiment'. Most traditional approaches assume that the choice alternatives are mutually exclusive over time, which is a questionable assumption. We introduce a new copula-based model (CO-CUB) for the transition probability, which can handle the dependence structure of best–worst choices while imposing a very practical constraint. We use a conditional logit model to calculate the utility at consecutive time points and propagate it to future time points under dynamic programming. We suggest that the CO-CUB transition probability algorithm is a novel way to analyze and predict choices at future time points by expressing human choice behavior. The numerical results inform decision making and help formulate strategy and learning algorithms under dynamic utility in time for best–worst DCEs.


Introduction
Modelling choice behavior has always been a challenging part of statistical research. The use of best-worst scaling (BWS) choice pairs to model behavior was coined by Louviere and Woodworth (1990) [1]. With the addition of attributes and attribute levels, logistic distributions are convenient tools from analytic and practical standpoints; however, they rely on assumptions about the ordinality of the responses that can only be met in limited situations. Furthermore, computational issues related to the large number of choices in best-worst settings obscure patterns in the preferences. Discrete choice experiments (DCEs) are among the popular models used to quantify the influence of the attributes that characterize the choice options. In the literature, Train (2009) [2] presents multiple models based on different assumptions about the distribution of the random components; in some of these models, the error terms are assumed homogeneous and uncorrelated. Lancsar et al. (2013) [3] used a form of choice experiment, 'best-worst discrete choice experiments', designed by asking respondents to choose not only the best option in a choice set but also the worst, followed by the best of the remaining options, and so on, until the implied preference ordering of the options is obtained. Marley and Louviere (2005) [4] and Marley et al. (2008) [5] discuss some theoretical aspects of attribute-level best-worst experiments, and Flynn et al. (2007) [6] provide detailed instructions for the analysis of such experiments. Following Marley et al. (2008) [5], Street and Knox (2012) [7] constructed a probability function, called the best-worst choice probability, which can be used to find the probability of a choice moving from one state to another. Working et al. (2018) [8] used a conditional logit approach to calculate the transition (conditional choice) probabilities over time, assuming that consumer behavior is a time-dependent process.
The assumption that making two choices at two different time points is an independent process for an individual cannot be justified in general. As mentioned in Potoglou et al. (2011) [9], the interaction between attributes and choice pairs needs further piloting and investigation. The dependence among choices made over time is not fully incorporated in much of the research. Dependence modelling is a broad and fast-moving field, but the theory is biased towards estimation methods. The premise of traditional choice models is mutually exclusive alternatives; that is, once a choice is made, the alternatives are no longer selected. An alternative would be to build patterns of the preferences and their weighted measures. That option opens a new framework for choices, especially those made over time. Many factors shape the choices made. We include a priority in the selection over time and build it into the design. Specific association among the choices is highlighted to show the need to develop progressive models in which temporal dependence among the choices is integrated and utility is computed. In that sense, the innovation we propose is built on copula theory, as it captures the dependence among the choices made over time periods irrespective of the marginal distributions. We utilize a completely new copula structure, called 'CO-CUB', to calculate transition probabilities for BWS choices over time.
The paper is organized as follows. In Section 2, DCMs are reviewed; the design for the BW choice pairs and their utilities under a priority setting is also included. The transition probabilities under CO-CUB are described in Section 3. We present the model's theory over a time sequence to quantify and measure consumer behavior and derive utilities using Markov decision processes (MDPs) in Section 4. Section 5 is about the application: empirical data from Flynn et al. (2007) [6] are simulated and our methods are tested, which can estimate selected significant parameters such as the 'feeling' and 'uncertainty' of human beings when making a choice. The literature on copulas for DCEs is in its infancy. Utilizing our newly constructed probability model, the choices of individuals are analyzed, comparisons are made, and we end with a conclusion.

Preliminary Results for Best-Worst Discrete Choice Modelling
Best-worst scaling experiments are modified DCEs designed to elicit further information about the best and worst product, or the best and worst attributes and attribute levels of a product. Respondents are asked to make choices between different attributes within the same product; that is, the alternatives are the attributes. In the experiment, each alternative is described by a set of K attributes, or characteristics, of the product or scenario being modelled. Further, each attribute k contains l_k levels. That is, each product is represented by a profile X = (A_1, A_2, ..., A_K), where A_i represents an attribute level of a choice. Consider G profiles such that P_1 = (A_11, A_12, ..., A_1K), P_2 = (A_21, A_22, ..., A_2K), ..., P_G = (A_G1, A_G2, ..., A_GK). For every profile, the choice set C_X consists of ordered pairs of attribute levels in which the first attribute level is the best and the second is the worst, called best-worst pairs. The model is designed in such a way that, within a choice pair (x_a, x_b), x_a and x_b cannot come from the same attribute. From the choice set C_X, the respondent determines a BW choice pair out of τ such pairs, where τ = K(K − 1). If we consider G products, that is, G profiles and G associated choice sets, then we can define a finite set S = {C_1, C_2, ..., C_G} such that C_i ≠ C_j for all i ≠ j. The BW choice pairs are illustrated in Figure 1, which shows all the possible combinations.
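The construction of the BW choice pairs described above can be sketched as follows; the attribute names and levels here are hypothetical stand-ins, not the actual experimental design:

```python
from itertools import permutations

def bw_choice_pairs(profile):
    """Enumerate all best-worst pairs for a profile.

    `profile` maps each attribute name to the attribute level it shows.
    A pair (best, worst) must come from two different attributes, so
    the number of pairs is tau = K * (K - 1).
    """
    attrs = list(profile.keys())
    # ordered pairs of distinct attributes; first is 'best', second is 'worst'
    return [(profile[a], profile[b]) for a, b in permutations(attrs, 2)]

# A toy profile with K = 3 attributes (illustrative names and levels).
profile = {"Attachment": "Attach_lot", "Security": "Security_little", "Role": "Role_few"}
pairs = bw_choice_pairs(profile)
assert len(pairs) == 3 * 2  # tau = K(K - 1) = 6
```

Each ordered pair records the level picked as best first and the level picked as worst second, which is why τ counts ordered rather than unordered pairs.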
The total number of attribute levels is

L = ∑_{k=1}^{K} l_k, (1)

where l_k is the number of levels for the k-th attribute, 1 ≤ k ≤ K, and the total number of unique attribute-level pairs is ∑_{k=1}^{K} l_k(L − l_k). Equation (1) is also described in Street and Knox (2012) [7].

Design of Experiment
Assume that the profiles have K attributes and each attribute has l_k levels, where k = 1, 2, ..., K. The simulation was performed in such a way that each person makes a choice in every set. The debate concerning optimal designs for DCEs could be enhanced by a closer connection to the established literature on the optimal design of experiments, as suggested in Sun et al. (2023) [10]. We used an orthogonal design here for the simulation. There are three standard models in best-worst choice design experiments: (1) paired, (2) marginal, and (3) marginal sequential models. In this manuscript, we used the paired model (Aizaki and Fogarty (2019) [11]; Marley and Louviere (2005) [4]), which assumes that the difference in utility between the two chosen levels represents the greatest utility difference among all τ utility differences. Flynn et al. (2007) [6] used a paired model in their 'Quality of Life' analysis. In this approach, the attribute variables are created as dummy-coded variables (attribute-specific constants), with reversed signs when the attributes are treated as the worst, and the level variables are created with effect coding with a base level for each attribute. The signs of the effect-coded level variables are likewise reversed when levels are treated as the worst. When estimating this model, an arbitrary attribute variable is omitted, and the coefficient of the omitted attribute variable is normalized to zero; that is, the coefficients of the remaining attribute variables are estimated relative to the omitted attribute. Unlike dummy coding, effect coding allows the calculation of a coefficient for the base level of each attribute: it is the negative of the sum of the remaining coefficients in that attribute.
Let us consider the general formulation of the paired model. Consider the K attributes A_1, A_2, ..., A_K, each with l_k levels. Assume that attribute A_K is omitted and that the last level (l_k) in each attribute (i.e., levels A_{1l_k}, A_{2l_k}, ..., A_{Kl_k}) is the base level. As in Aizaki and Fogarty (2019) [11], the systematic component of the utility function is then given by Equation (2), where the βs are the coefficients to be estimated and the Ms are the design matrices associated with the attributes and attribute levels. The coefficient of the attribute variable corresponding to A_K is zero, and the coefficients of the base levels A_{jl_k} can be calculated by β_{A_{jl_k}} = −∑_{i=1}^{l_k−1} β_{A_{ji}} for j = 1, 2, ..., K. Detailed calculations are shown in the application section. M is composed of indicators for the best and worst attributes and attribute levels. Let M be the Z × P design matrix, where Z = ∑_{k=1}^{K} l_k(L − l_k) and P = Kl_k − 1. The columns corresponding to attribute A_K and the base levels (A_{1l_k}, A_{2l_k}, ..., A_{Kl_k}) are omitted from the design matrix. Consider the choice pair (x_ij, x_ij′) from the choice set C_i, for i = 1, 2, ..., G, j ≠ j′ = 1, 2, ..., K, and 1 ≤ x_ij ≤ l_k. Let M_{A_h x_ih} be the data for the attribute level 1 ≤ x_ih ≤ l_h, h = 1, 2, ..., K − 1 (non-base levels) within attribute A_j, for all j = 1, 2, ..., K. Referring to the choice pair (x_ij, x_ij′), the corresponding data for the attribute levels are given by Equations (3)–(5). Table 1 shows the design matrix based on data from Flynn et al.'s (2007) [6] 'Quality of Life' experiment. The estimated parameters of the choices are given in Table 2. The BW choice pairs are then arranged according to their popularity (highest number of picks) in Table 3.

Utility Function-Attribute-Level Best-Worst Design
The utility function, as described in McFadden (1974) [12], for the i-th consumer/individual selecting the j-th choice is given as U_ij = V_ij + ε_ij, where U_ij is the utility for the i-th consumer selecting the j-th choice, V_ij captures the systematic component, and ε_ij captures the error component. The distribution of the error terms is studied in McFadden (1974) [12]. He proposed the type I extreme value distribution (Gumbel distribution) for the error terms, which leads to the conditional logit model. However, this utility function can only accommodate single-choice cases. Under random utility theory, the probability of an alternative is based on the utility as defined in Equation (6); Flynn et al. (2007, 2008) [6,13] and Louviere et al. (2008) [14] provided the utility function for best-worst choice models. Here, we consider the choice set C with (x_ij, x_ij′) being the chosen pair; the utility for choosing this pair within set C is then given by U_ijj′ = V_ijj′ + ε_ijj′ (Equation (7)), where U_ijj′ is the utility value for the i-th consumer selecting the j-th choice as the best and the j′-th as the worst, V_ijj′ is the corresponding systematic component, and ε_ijj′ is the error term. The systematic component in Equation (7) can be calculated under the above-described paired model (Equation (2)).
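Under Gumbel errors, the conditional logit choice probability is the softmax of the systematic components over the τ best-worst pairs; a minimal generic sketch (the function name is ours, not from any package):

```python
import math

def conditional_logit_probs(v):
    """Choice probabilities P_k = exp(V_k) / sum_j exp(V_j), given the
    systematic components V_k of the tau best-worst pairs."""
    m = max(v)                       # subtract the max for numerical stability
    e = [math.exp(x - m) for x in v]
    z = sum(e)
    return [x / z for x in e]

probs = conditional_logit_probs([0.4, -0.1, 1.2])
assert abs(sum(probs) - 1.0) < 1e-12
assert probs[2] == max(probs)  # the pair with the largest V is the most likely
```

The max-subtraction leaves the probabilities unchanged while preventing overflow for large systematic components.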

Transition Probability
In the context of choice models, we say that the choices are not independent of irrelevant alternatives (IIA) and that they are also time dependent. Therefore, we may not achieve accurate overall performance if we do not account for the relationship between present and future decisions and their outcomes. Our interest is in discrete-time, finite-horizon Markov decision processes (MDPs); that is, t = 1, 2, ..., T, where T is a fixed number of time periods. The rewards are maximized by the best sequential decisions over time, making MDPs a dynamic optimization tool, as used in Blanchet (2016) [15] to identify the right choices of substitution behaviors of consumers. Let s_t ∈ S be the state occupied at time t, r_t(s_t) the reward associated with s_t, and d_t(r_t, s_t) the decision based on the possible rewards and states at time t. The decision process maps the movement from one state to another over time t based on rewards received and an optimal decision set. As the decision process is Markovian, the transition probability to the next state, S′ = S_{t+1}, depends solely on the decision made at the current state, S = S_t: P_{SS′} = P[S_{t+1} = S′ | S_t = S], where t = 1, ..., T. The conditional probability of state transition from S to S′ is defined in Equation (8). The transition probability of state s′ is not homogeneous. As described in Equation (8), it depends on the action taken in state S. Factors such as the 'feeling' and 'uncertainty' of an individual can strongly affect this probability. Further, human behavior has a tendency to select options that offer immediate gratification, or those within the nearest range, rather than alternatives further away in time. These are some of the challenging constraints in building a transition probability function. Further, the dependence structure of adjacent-time probabilities needs to be handled carefully. Based on McFadden's (1974) [12] conditional logit model on the error term, Working et al. (2018) [8] establish a time-dependent probability formula while avoiding the IIA condition. It is reasonable as an initial step, but neither the dependence structure of the choice at the next time point nor the above-mentioned 'close at hand' constraint can be seen in that transition matrix. In this manuscript, we introduce a copula-based distribution with a CUB marginal, which is a combination of discrete uniform and shifted binomial distributions, introduced by Piccolo (2003) [16]. This model is a totally new approach to discrete choice modelling.

Copula Methods
Many stochastic models define the relation between X and Y through expectations, E(Y | X = x) = α + βx. Copulas are functions that connect multivariate distributions to their one-dimensional margins. If F is a d-dimensional cumulative distribution function with one-dimensional margins F_1, ..., F_d, then there exists a d-dimensional copula C such that F(y_1, ..., y_d) = C(F_1(y_1), ..., F_d(y_d)). The case d = 2 has attracted special attention; see Trivedi and Zimmer (2005) [17]. Copula modelling demonstrates that practical implementation and estimation are relatively straightforward despite the complexity of the theoretical foundations. The copula approach to constructing joint distributions has gained popularity in recent years in applied research in finance, civil engineering, medicine, climate and weather research, and many other fields. Copula models have been widely used for modelling and characterizing the dependence structure in multivariate data. Their extension to handle discrete as well as continuous data offers the means to model the probability of best-worst pair selection.

CUB Probability Model
Piccolo (2003) [16], Piccolo and Simone (2019) [18], and D'Elia and Piccolo (2005) [19] introduce a radical model which is a Combination of a discrete Uniform and a shifted Binomial distribution (CUB). CUB jointly considers two latent components, called 'feeling' and 'uncertainty'. General references for the statistical background of CUB are Iannario (2012) [20] and Iannario and Piccolo (2010) [21]. The parsimonious parameterization and the ease of estimation and interpretation make CUB models a very useful tool for choice data analyses. Piccolo and Simone (2019) [18] assert that it is possible to fit statistical models to ordinal/choice data without the need for explanatory covariates. Our interest is in using CUB marginals in a copula distribution to construct a transition probability matrix which can be used to identify human choice behavior at future time points, given the present state of choice. An interesting attempt at defining a bivariate CUB distribution using a multivariate model with fixed margins is made by Andreis and Ferrari (2013) [22]. CUB is a class of mixture models, possibly involving covariates, developed as a new approach for modelling discrete choice processes. CUB models consider two components, the 'uncertainty' and 'feeling' of humans when making a choice. The intrinsic 'uncertainty' in choosing an item is modelled through a discrete uniform variable; the latent process leading to the choice is driven by the subjective 'feeling' and modelled using a shifted binomial distribution. Consider the probability of observing a particular response r = 1, 2, ..., m, with m known and m > 3 assumed to ensure identifiability. The mixture probability model described in Piccolo (2003) [16] and D'Elia and Piccolo (2005) [19] is given as:

P(R = r) = π · C(m − 1, r − 1) · (1 − ξ)^{r−1} ξ^{m−r} + (1 − π)/m, r = 1, 2, ..., m, (9)

with π ∈ (0, 1] and ξ ∈ (0, 1].
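The CUB mixture mass function can be sketched directly from the definition (shifted binomial weighted by π plus a discrete uniform); the parameter values below are illustrative only:

```python
from math import comb

def cub_pmf(r, m, pi, xi):
    """CUB probability: pi * shifted binomial + (1 - pi) * discrete uniform.
    Larger pi means less uncertainty; xi measures the strength of feeling."""
    binom = comb(m - 1, r - 1) * (1 - xi) ** (r - 1) * xi ** (m - r)
    return pi * binom + (1 - pi) / m

m, pi, xi = 9, 0.7, 0.3
p = [cub_pmf(r, m, pi, xi) for r in range(1, m + 1)]
assert abs(sum(p) - 1.0) < 1e-12  # a proper probability distribution
```

The shifted binomial part sums to ((1 − ξ) + ξ)^{m−1} = 1 over r = 1, ..., m, so the mixture is a valid distribution for any π ∈ (0, 1].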
Here, π defines the weight of the shifted binomial relative to the discrete uniform and is inversely related to the amount of uncertainty in the choice decisions; ξ relates to personal preferences and measures the strength of feeling. For CUB models, the copula representation is non-unique. The non-uniqueness stems from the fact that marginal distribution functions which are not strictly increasing, but only non-decreasing, do not possess an inverse in the usual sense, only a pseudo-inverse; see Nelsen (2007) [23] and Genest et al. (2007) [24]. For that reason, we must be particularly cautious when handling copulas with non-continuous margins, as is the case in the CUB model.

The CO-CUB Model (Copula-Based CUB)
Andreis and Ferrari (2013) [22] define a multidimensional extension of the CUB model in Equation (9), called the CO-CUB model: a multivariate copula with discrete margins, each following a CUB distribution. A k-dimensional (k ≥ 2) CO-CUB model with copula C is a multivariate discrete variable with margins R_i ∼ CUB(π_i, ξ_i), i = 1, ..., k, each with support {1, ..., m_i}, m_i > 3. The joint distribution function is given by:

F(r_1, ..., r_k) = C_α(F_1(r_1; π_1, ξ_1), ..., F_k(r_k; π_k, ξ_k)), (10)

where π = (π_1, ..., π_k)′ and ξ = (ξ_1, ..., ξ_k)′ are as in Equation (9), for a particular choice of copula C characterized by a parameter α = (α_1, ..., α_d)′ taking values in some real d-dimensional space and defining the dependence structure of its components. F_i(r_i; π_i, ξ_i) stands for the distribution function of the i-th margin, that is, F_i(r_i) = P(R_i ≤ r_i), and the support of the CO-CUB variable is the grid {1, ..., m_1} × ... × {1, ..., m_k}. The whole parameter set for a k-dimensional CO-CUB is then the ordered triplet (π, ξ, α) ∈ [0, 1]^k × [0, 1]^k × A, with the following interpretation: being margins of CUB distributions, the parameters π and ξ retain the same interpretation as in the unidimensional case, while the copula parameter α acts as a dependence measure whose meaning depends on the specific copula C adopted in Equation (10). Andreis and Ferrari (2013) [22] argued that the Plackett copula is one of the better candidates for the CO-CUB model.

The CO-CUB Model with Plackett Copula without Covariates
The Plackett copula is defined as:

C_α(u, v) = [1 + (α − 1)(u + v) − √((1 + (α − 1)(u + v))² − 4α(α − 1)uv)] / (2(α − 1)), for α ∈ (0, 1) ∪ (1, +∞),

with C_1(u, v) = uv in the limiting case α = 1 (independence). Here, u = v_{π_1, ξ_1}(r_1) = F_1(r_1; π_1, ξ_1), where m_1 is the (fixed) number of categories of the first CUB model. The definition of v_{π_2, ξ_2} is similar, and we will from now on assume m_1 = m_2 = m > 3. The copula probability mass function can be obtained via rectangle differences of the distribution function:

c(r_1, r_2) = C_α(F_1(r_1), F_2(r_2)) − C_α(F_1(r_1 − 1), F_2(r_2)) − C_α(F_1(r_1), F_2(r_2 − 1)) + C_α(F_1(r_1 − 1), F_2(r_2 − 1)). (11)

The model parameters are then π_1, ξ_1, π_2, ξ_2, and α. Estimation can be performed by the inference for margins (IFM) method, as shown in Joe and Xu (1996) [25].
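Putting the pieces together, the CO-CUB mass on the support grid can be sketched by applying the rectangle (inclusion-exclusion) rule of the Plackett copula to two CUB margins; all parameter values below are illustrative, not estimates from the paper:

```python
from math import comb, sqrt

def cub_cdf(r, m, pi, xi):
    """CDF of a CUB(pi, xi) margin on {1, ..., m}."""
    total = 0.0
    for k in range(1, r + 1):
        total += pi * comb(m - 1, k - 1) * (1 - xi) ** (k - 1) * xi ** (m - k) + (1 - pi) / m
    return min(total, 1.0)  # guard against floating-point overshoot

def plackett_cdf(u, v, alpha):
    """Plackett copula C_alpha(u, v); alpha = 1 is independence."""
    if alpha == 1.0:
        return u * v
    s = 1 + (alpha - 1) * (u + v)
    return (s - sqrt(s * s - 4 * alpha * (alpha - 1) * u * v)) / (2 * (alpha - 1))

def cocub_pmf(r1, r2, m, p1, x1, p2, x2, alpha):
    """P(R1 = r1, R2 = r2) via rectangle differences of the copula CDF."""
    F1 = lambda r: 0.0 if r < 1 else cub_cdf(r, m, p1, x1)
    F2 = lambda r: 0.0 if r < 1 else cub_cdf(r, m, p2, x2)
    return (plackett_cdf(F1(r1), F2(r2), alpha)
            - plackett_cdf(F1(r1 - 1), F2(r2), alpha)
            - plackett_cdf(F1(r1), F2(r2 - 1), alpha)
            + plackett_cdf(F1(r1 - 1), F2(r2 - 1), alpha))

m, alpha = 9, 4.0
grid = sum(cocub_pmf(a, b, m, 0.7, 0.3, 0.6, 0.5, alpha)
           for a in range(1, m + 1) for b in range(1, m + 1))
assert abs(grid - 1.0) < 1e-8  # the joint mass sums to 1 over the grid
```

Because the margins are discrete, the joint mass must be formed by these rectangle differences rather than by a copula density, which is exactly why caution with non-continuous margins is needed.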

Setting up the Transition Probability Matrix
Let F(x, y) = C(x, y) be the cumulative distribution function. For notational convenience, let the choice state at time point t + 1 be s_{i(t+1)} := s′_i and the state at t be s_{i(t)} := s_i, where s′_i and s_i are ordered labels (ranks) of the BW choice pairs, s′_i, s_i = 1, 2, ..., τ; i = 1, 2, ..., G; and time t = 1, 2, ..., T. The transition probability is denoted P^t_{i, s s′}. Since data from Flynn et al.'s (2007) [6] experiment are not publicly available, we simulated data under the paired model and distributed them among different choice sets. We simulated a number of different datasets under the same set of estimated regression parameters to prioritize choice pairs based on their popularity. That is, for a given choice set C, we counted the number of times an individual picked the choice pair (x_ij, x_ij′). Finally, all the choices were arranged according to their popularity, that is, from the highest count to the lowest. This procedure gives the elements in a choice set an ordinal standing, which is a necessary requirement for incorporating the CUB model for choice data. Further, we adjusted the function for the repetition of selections in order to keep the transition probability formula a probability distribution function. The transition probability formula can then be written as in Equation (12), where n is the number of repetitions of the selection of the pair (x_ij, x_ij′) in the selected choice set, C_p = ∑_{r=1}^{N} c(u = g, v) · I(c(u = g, v)), and c is the density function of the Plackett copula as shown in Equation (11). Let N be the size of the random sample, g, h = 1, 2, ..., τ, with τ the number of BW choice pairs considered in each choice set. The probability transition matrix for the i-th choice set then takes the form of Equation (13). P_{i s s′} denotes the transition probability of a consumer changing his or her choice state s at time t to s′ at time t + 1. For all transitions, ∑_{s′=1}^{τ} P_{i s s′} = 1 (e.g., the sum of the row values in Equation (13) is 1), and if there are no possible transitions, ∑_{s′=1}^{τ} P_{i s s′} = 0. The ordering scheme is such that for |s′_i − s_i| ≤ ω, ∑ P_{s s′} > ∑ P_{s s′′} for any other s′′ ≠ s′ in the i-th choice set, where ω is some small time-epoch number. In simple words, the row sum of the probabilities over the region |s′_i − s_i| ≤ ω is larger than any comparable probability sum in the same row of the transition matrix for the i-th choice set. We introduce the constraint |s′_i − s_i| ≤ ω in our copula model because human behavior shows a greater tendency to grab options that are close at hand, or within the nearest range, rather than alternatives further away in time. The data were simulated based on that choice behavior assumption, that is, |s′_i − s_i| ≤ ω. The transition matrix may be either stationary or dynamic in nature; in this manuscript, we considered only the stationary case. It is important to state that the transition probability matrix is calculated by considering all N individuals in the experiment. The row sum of the transition matrix is kept at 0 if there is no transition from s to s′. These properties can be seen in Tables 4 and 5.
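The row conventions just described (each observed row normalized to 1, rows with no observed transitions left at 0) can be sketched generically; the counts below are illustrative, not the paper's simulated data:

```python
def transition_matrix(counts):
    """Row-normalize a tau x tau matrix of observed transition counts.

    counts[s][s2] is how often consumers moved from BW-pair rank s at
    time t to rank s2 at time t + 1. Rows with no observed transitions
    are left as all zeros, matching the convention in the text.
    """
    P = []
    for row in counts:
        total = sum(row)
        P.append([c / total for c in row] if total > 0 else [0.0] * len(row))
    return P

counts = [[5, 3, 0], [0, 0, 0], [1, 1, 2]]
P = transition_matrix(counts)
assert abs(sum(P[0]) - 1.0) < 1e-12
assert sum(P[1]) == 0.0  # no transitions observed from state 2
```

In the paper the row weights come from the Plackett-copula density rather than raw counts, but the normalization and zero-row handling are the same.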

Utility in Time with Dynamic Programming
In this part, we incorporate MDPs and dynamic programming to calculate utility in time with the help of transition probabilities. Bellman (1954, 1956) [26,27] introduced a dynamic programming technique to evaluate the utility value function, also known as Bellman's equation. At each time step, dynamic programming uses numerical methods to evaluate the value function moving backwards in time. Rust (1994, 2008) [28,29] and Ellickson (2011) [30] presented a numerical method with steps for evaluating DCEs as MDPs.
The value function for the DCEs is defined at each time t as:

V_t(s_t) = E[ ∑_{t′=t}^{T} γ^{t′−t} (U(x_{t′}, d_{t′}) + ε_{t′}) ], (14)

with P_{ss′} = P(s_{t+1} | s_t) = P(s_{t+1} = s′ | s_t = s). This may also be defined in terms of the transition in decisions as P(d_{t+1} | d_t, x_t), where d_t ∈ D is the decision at time t, U(x_t, d_t) is the derived expected utility, and ε_t is the associated error term at time t, t = 1, 2, ..., T. The discount rate in Equation (14) is γ ∈ (0, 1); see Feinberg and Schwartz (1994) [31]. There exist τ value functions, one for each of the τ alternatives in the experiment, evaluated at each time point t = 1, 2, ..., T. The sum runs from t up to T because it is evaluated using a backwards recursive method; that is, we start at the last time point and work our way backwards to earlier time points.
In Equation (15), U_t is utility in time (UiT) and V_t is the value function. Based on Working et al. (2018) [8], the initial utility for the selected choice pair can be calculated from Equation (16), where β_A and β_{A_k} are the estimated parameters for the best attribute and attribute level and β_{A′} and β_{A′_k} are the estimated parameters for the worst attribute and attribute level, respectively. Instead of this simple model, in this manuscript we used the paired model, which is more appropriate for paired data. In fact, to distribute the utility over future time points, a further adjustment of the value function is needed. We adjusted the value function in Equation (15) for time in such a way that M is the design matrix under the paired model, containing +1, −1, and 0 as described in Equations (3)–(5). We broke M into M^+ and M^−, which contain the +1s and −1s, respectively. Further, we introduce arbitrary time-sensitive constants θ^t_b and θ^t_w to make the initial systematic component vary with time.
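The backward recursion for utility in time can be sketched as follows: start from the terminal utilities and fold in the γ-discounted expected future value through the transition matrix. The function name and values are illustrative, under the simplifying assumption of a fixed per-period utility vector:

```python
def utility_in_time(u0, P, T, gamma=0.95):
    """Backward dynamic programming for the value of each BW choice pair.

    u0[s] is the initial utility of pair s, P the stationary transition
    matrix, and gamma the discount factor; returns the value vector at
    each of the T time points, computed from t = T backwards.
    """
    tau = len(u0)
    V = [u0[:] for _ in range(T)]       # V[T-1] holds the terminal utilities
    for t in range(T - 2, -1, -1):      # work backwards in time
        for s in range(tau):
            future = sum(P[s][s2] * V[t + 1][s2] for s2 in range(tau))
            V[t][s] = u0[s] + gamma * future
    return V

P = [[0.8, 0.2], [0.3, 0.7]]
V = utility_in_time([1.0, 0.5], P, T=10, gamma=0.95)
assert V[0][0] > V[9][0]  # earlier time points accumulate more discounted value
```

This mirrors the recursive evaluation used for Table 7: each earlier time point adds the discounted, transition-weighted value of the next.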

Application and the Design of the Experiment
A pilot best-worst study was conducted in summer 2005 among people aged 65 and over, with the aim of informing a larger quality-of-life valuation exercise by Flynn et al. (2007) [6]. In this survey, there are K = 5 attributes, namely Attachment, Security, Role, Enjoyment, and Control, each with l_k = 4 levels. That is:
Attachment: Attach_none, Attach_little, Attach_lot, Attach_all
Security: Security_none, Security_little, Security_lot, Security_all
Role: Role_none, Role_few, Role_many, Role_all
Enjoyment: Enjoyment_none, Enjoyment_little, Enjoyment_lot, Enjoyment_all
Control: Control_none, Control_few, Control_many, Control_all
We simulated a new data set for 100 individuals and estimated regression parameters using a conditional logit model with the help of the R package 'support.BWS2'; see Aizaki and Fogarty (2019) [11]. The application uses a balanced design in which all five attributes have four levels each; there are 320 unique BW choice pairs. We simulated data in such a way that 16 choice sets were used to cover all 320 unique BW choice pairs (see Equation (1)), with each choice set containing 20 choice pairs. Each of the 100 individuals makes a choice in each choice set. Figure 2 shows the orthogonal design we used to simulate the data. The shape of the questionnaire design based on the orthogonal array is shown in Figure 3. Such a questionnaire could be used in a real-life data collection process. Table 1 shows the design matrix for the paired model. The attribute 'Control' is omitted, and the fourth level in each attribute ('Attach_all', 'Security_all', 'Enjoyment_all', 'Role_all', and 'Control_all') is considered the base level.
Stats 2024, 7, FOR PEER REVIEW 13 Security: Security_none, Security_little, Security_lot, Security_all Role: Enjoyment_none, Enjoyment_little, Enjoyment_lot, Enjoyment_all Enjoyment: Role_none, Role_few, Role_many, Role_all Control: Control_none, Control_few, Control_many, Control_all We simulated a new data set for 100 individuals and estimated regression parameters by using a conditional logit model with the help of R package 'support.BWS2'; see Aizaki and Fogarty (2019) [11].The application is from a balance design where all five attributes have four-levels each; there are 320 unique BW choice pairs.We simulated data in such a way that 16 choice sets were used to cage all 320 unique BW choice pairs (see Equation ( 1)), by keeping each choice set with 20 choice pairs.Each person of the 100 sample should make a choice in each choice set. Figure 2 shows the orthogonal design we used to simulate the data.The shape of the questionnaire design based on the orthogonal array is shown in Figure 3.Such a questionnaire could be used in the case of a real-life data collection process.Table 1 shows the design matrix for the paired model.The attribute 'Control' is omitted and the fourth level in each attribute ('Attach_all', 'Security_all', 'En-joyment_all', 'Role_all' and 'Control_all') is considered to be the base level.The parameter estimated by Flynn et al. (2007) [6] and conditional logit regression estimates for the simulation are presented in Table 2. 
Based on Equation (17), the systematic components for each choice, and then the corresponding utilities, can be calculated. Estimates for the base levels are calculated by β_{A_j l_k} = −∑_{i=1}^{k−1} β_{A_j i}, as mentioned in Section 2.1.1. Following the iterative dynamic programming method introduced in Equation (15), with the initial utility described in Equation (16), utility values for 10 consecutive time periods are calculated recursively using the transition probabilities and a discount factor γ = 0.95. As a demonstration, the results for choice set 1 are presented in this manuscript.
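The two computations above can be sketched as follows. This is our own illustrative Python (the paper's analysis used R), the coefficient values are hypothetical, and the update U ← γPU is one plausible reading of the recursion in Equations (15) and (16), not the paper's exact code.

```python
import numpy as np

# Base-level estimate: the coefficient of the omitted fourth level of an
# attribute is minus the sum of the coefficients of its other levels,
# beta_base = -sum_{i=1}^{k-1} beta_i (Section 2.1.1).
beta_levels = np.array([0.8, 0.3, -0.2])   # hypothetical estimates, levels 1..3
beta_base = -beta_levels.sum()             # coefficient of level 4 (base)

# Discounted dynamic-programming update over 10 time points: current
# utilities are spread forward through the 20x20 transition matrix P and
# discounted by gamma = 0.95.
gamma = 0.95
P = np.full((20, 20), 1 / 20)                       # placeholder transition matrix
U = np.random.default_rng(0).normal(size=20)        # initial utilities
for t in range(10):
    U = gamma * P @ U                               # utilities at time t + 1
```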
The BW choice pairs are arranged in order, as mentioned in Section 3.4. We ran the choice selection algorithm 10,000 times within each choice set to obtain a proper ordering of the elements. Table 3 shows the arrangement of the BW pairs for choice set 1, organized according to the popularity (highest number of picks) of the BW choice pairs. That information is then used to calculate the transition probabilities at the next time point. The probability transition matrix can be calculated using our new model as derived in Equation (12). It is a 20 × 20 matrix since each choice set contains 20 choice pairs. Figure 4 shows the highest probability corresponding to each BW choice pair in choice set 1 for three different sample sizes, n = 100, n = 500, and n = 1000; that is, the most probable transition slot from time t to t + 1. Tables 4-6 show the corresponding transition probability matrices for choice set 1 for sample sizes 100, 500, and 1000, respectively. The constraint region, |s_i' − s_i| ≤ 3, is highlighted in each transition matrix and the highest probability is marked with a square. Nobody picked choice pairs 11, 12, 15, 19, and 20 at time t or t + 1 when n = 100; only choice pair 19 was unpicked in the n = 500 scenario, while for n = 1000 all the choice pairs were selected. Table 7 shows the utilities in time (UiT) calculated for time points t = 1, 2, ..., 10. Figure 5 shows the utility-time distribution of selected choice pairs (2, 7, 14, 16, and 18) in choice set 1, highlighting the dynamic utility in time.
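A minimal sketch of how such a band-constrained 20 × 20 transition matrix could be formed (our own Python with simulated counts; the paper derives the probabilities from the CO-CUB model in Equation (12)): transitions outside the constraint region |s_i' − s_i| ≤ 3 are zeroed and each row is renormalised.

```python
import numpy as np

n_pairs, omega = 20, 3
rng = np.random.default_rng(1)

# Simulated counts of observed transitions between popularity slots
# (a stand-in for real choice data).
counts = rng.integers(1, 50, size=(n_pairs, n_pairs)).astype(float)

# Constraint region |s_i' - s_i| <= omega: a consumer only moves to a
# choice pair whose popularity rank is within omega slots of the current one.
band = np.abs(np.subtract.outer(np.arange(n_pairs), np.arange(n_pairs))) <= omega
counts[~band] = 0.0

# Renormalise each row into transition probabilities.
row_sums = counts.sum(axis=1, keepdims=True)
P = counts / row_sums
```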


Conclusions
In general, choices are made at different time points and are not IIA for any individual. They depend not only on qualitative factors such as priority setting, feeling, and uncertainty, but also on quantitative covariates such as the sequential selection probability in the process. By considering a specific popular model, we capture and monitor the dependence structure and calculate the transition probability between consecutive choices at adjacent time points. We were able to include parameters such as feeling and uncertainty in the model. This new approach to calculating transition probabilities extends the ideas proposed by Piccolo (2003) [16] and Piccolo et al. (2019) [32]. Further, we could justify our initial assumption about human choice behavior, |s_i' − s_i| ≤ ω: consumers have a greater tendency to grab options that are close at hand, or within the nearest range, rather than picking alternatives further away in time.
Data to quantify BW in DCE models are mainly found outside of academia; in this paper, we have simulated the data. This is one limitation of the paper, but the approach has the advantage of producing a complete ranking. The analysis could also gain from additional covariates. Based on Flynn's 'Quality of Life Experiment', we chose the value ω = 3. The accuracy of the CO-CUB model can be further improved by relaxing that constraint, adding more latent covariates, and updating knowledge of the ranking labels of the choice pairs within the choice set. We leave such ideas for future research.

Figure 2. Orthogonal design for Flynn's experiment: a flow chart of the 5 attributes, each with 4 levels, across 16 choice sets; e.g., A4 is the 1st attribute at the 4th level.

Figure 3. Questionnaire design based on the Orthogonal Array.


Figure 4. Most probable transition slots from time t to t + 1.


Figure 5. Utility-time graph for some selected BW choice pairs in choice set 1.

Author Contributions: Conceptualization, S.A. and N.D.; methodology, S.A. and N.D.; software, S.A. and N.D.; validation, S.A. and N.D.; formal analysis, S.A. and N.D.; investigation, S.A. and N.D.; data curation, S.A. and N.D.; writing-original draft preparation, S.A. and N.D.; writing-review and editing, S.A. and N.D. All authors have read and agreed to the published version of the manuscript.

Funding: This research received no external funding.

Table 2. Estimated parameters: βFlynn (SE Flynn) from Flynn's experiment and βSim (SE Sim) from the simulation. The * represents the base levels in our simulated data.

Table 3. Arrangement of BW choice pairs in choice set 1, based on their popularity.

Table 7. Expected utilities in time (UiT) for choice set 1.
