Competitive Pricing for Multiple Market Segments Considering Consumers’ Willingness to Pay

Abstract: Defining prices and deciding which consumer segments to target is challenging in competitive markets that sell bundles. On the one hand, methodologies focused on competition are usually appropriate for analyzing market dynamics but not for helping decision makers in specific pricing tasks. On the other hand, simplistic cost-oriented methods may fail to capture consumer behavior. We see these characteristics in markets such as telecommunications, retail, and financial services, among others. We propose a framework to support pricing decisions for products with multiple attributes in competitive markets, considering consumers' willingness to pay and multiple segments. The proposed model is a nonlinear probabilistic profit maximization problem. We represent the demand for products and services through a multinomial logit model and then include consumers' maximum willingness to pay through soft constraints within the demand function. Since the profit function is non-concave, we deal with the nonlinearity and the multiple optima by solving the model through an equivalent nonlinear model and a particle swarm optimization (PSO) heuristic. This setting allows us to find the prices that achieve equilibrium for the game among firms that maximize their profits. With these features, our approach enables decision makers to set prices optimally. Estimating the parameters needed to run our model requires more effort than traditional multinomial approaches. Nevertheless, we show that it is essential to include these aspects because the optimal prices differ from those obtained with simpler models that omit them. Additionally, there are well-established methodologies available to estimate those parameters. Both the determination of the first-order optimality conditions and the PSO implementation allow us to find equilibria, quantify the effect of the consumers' maximum willingness to pay, and assess the relevance of competition. As complementary material, we analyze a case from a Chilean telecommunications company and show the results regarding price decisions and market share effects. According to our literature review, these aspects have not been handled and quantified jointly, as we do to support pricing.


Context
Pricing is perhaps one of the most important tasks when decision makers try to maximize firms' profit. To make this process more precise and useful, firms should consider the most relevant aspects, such as consumers' preferences [1][2][3], production costs [4][5][6], company characteristics [7][8][9][10], and whether the market is competitive or not [2,11][12][13][14], among others. In our context, we focus on useful tools for developing pricing models, such as discrete choice models, market segmentation, and competition analysis. In our opinion, there are relations among these topics and specific aspects that require more attention and research effort. We refer specifically to modeling the response
• Our model considers competition, in which firms react to others' price changes. With our approach, we are able to estimate the difference between the situations with and without competition, showing how valuable it is to model this aspect.
• We consider multiple consumer segments. This aspect makes the model more realistic and useful for decision making. The profit function to price multiple products or services for a single consumer segment is concave [15], but for multiple segments, it is not. To deal with this challenge, we design a solution strategy based on particle swarm optimization (PSO) [26,27]. PSO is a well-known and useful alternative for complex real-world problems [28][29][30].
• Costs in competition: We highlight the relevance of estimating competitors' costs, which leads to a better determination of prices and helps to analyze which consumer segments to focus on.
• Consumers' willingness to pay (wp) is a relevant factor that changes the price equilibrium. Our proposal allows us to quantify this effect on prices and helps us to make better pricing decisions in the presence of multiple segments.
• We model rational consumers (demand) through a constrained multinomial logit [31]. The utility function considers the consumers' valuation of the firm, prices, and attractiveness of products.
• Real case analysis: As an example of our proposal's potential in a real environment, we analyze a case from a Chilean telecommunications company. We quantify how competition and wp affect prices.

Literature Review
We focus on the topics relevant to our research and highlight the literature gaps we would like to bridge. This section begins with pricing, multiple segments, and competition; then, we review discrete choice models and consumers' willingness to pay.

Pricing, Multiple Segments and Competition
Price competition has been studied intensively in economics, starting with Bertrand's model [32] and followed by other highly relevant authors in this area, namely Edgeworth [33], Hotelling [34], and Chamberlin [35], among many others.
Pricing and competition have also been analyzed in operations management-related decisions and management science in more recent works. For example, Li et al. [36] analyzed sales quantity competition among multiple companies by mixing laboratory experiments and theoretical analyses. They derived some equilibrium properties that, in time, explained the experimental data. Mesak et al. [37] proposed an inventory-advertising model with inventory competition for two firms; as a part of their work, the authors modeled an extension to pricing competition.
Bitrán and Ferrer [5] and Pérez et al. [15] studied pricing and the analysis of products with multiple attributes (bundling); these topics are the base for this research. In a recent paper, Kopczewski et al. [38] analyzed the topic of competitive bundling and pricing based on the classic bundling price diagrams introduced by Adams and Yellen [39]. They studied the effect on the profitability of bundling strategies of the following three aspects: consumers' reservation prices, the complementarity of the demand, and economies of scale and scope. For a broader survey on pricing, bundling, and composition, see Pérez et al. [15] and Bitrán and Ferrer [5].
There has also been recent and intense research activity on pricing and logit models with applications. For example, Chen and Jiang [40] proposed a price optimization problem with a nested logit model to maximize retailers' profits considering capacity limits. Wang and Zhou [41] used a multinomial logit to represent consumers' preferences in a problem to define price and stock for a retailer selling fresh agriproducts. Zambrano-Rey et al. [42] developed a model to optimize prices and locations of retail stores that modeled the demand with a constrained multinomial logit and solved the location and pricing model with particle swarm optimization. Li et al. [21] analyzed a product-line price problem modeling the demand with a mixed multinomial logit and multiple market segments.
Researchers have spread their efforts on the topic of price competition into different paths. One of them, and the relevant one for this research, is considering logit demand, which has been widely studied over the last years. For example, Gallego et al. [17] described the conditions for the existence and uniqueness of a price equilibrium in a Bertrand oligopoly price competition scheme, applying a generalization of the logit demand model. Li and Huh [43] compared solutions for optimal pricing in monopoly against oligopolistic price equilibrium for multiple differentiated products using multinomial logit. In Kök and Xu [44], the authors studied the problem with a nested logit approach to study the effect of explicitly considering the choice process of the brand and product, concluding that optimal pricing results differ depending on when in the choice process customers select the product, either before brand selection or after. Aksoy-Pierson et al. [2] postulated a class of price competition models considering a mixed multinomial logit demand and characterized the equilibrium when an independent and separate firm sells each product. Gallego and Wang [19] analyzed multi-product oligopolistic price competition considering nested logit demand, including product-dependent price-sensitivity parameters. More recently, Besbes and Sauré [20] studied assortment and price competition using a multinomial logit demand. The authors studied two cases: one in which prices are exogenously fixed and firms compete only in assortments; and secondly, when retailers can modify price and assortment.
The simplest approach to represent consumers' preferences is to consider deterministic (and sometimes linear) demands [6,45]. Other approaches, such as those mentioned in the paragraph above, consider probabilistic demands by assuming some of their parameters to be random variables. A common and useful alternative is to assume that demand has two parts: a deterministic one and a random one. When the random part follows a Gumbel distribution, the result is the classical logit approach [46], which is quite popular.
Some linear and deterministic approaches that consider customers' willingness to pay (wp) in pricing competition models, such as those derived from Bolton [45], are quite interesting since they explicitly include a consumer budget constraint in a principal-agent model. With this approach, the feasible domain of prices shrinks to the range between costs and the consumers' willingness to pay. Other authors have instead bounded prices through a soft constraint that captures the effect of the willingness to pay, which may be considered more realistic than Bolton's proposal. However, representing wp as a soft constraint in a context where consumers' preferences are explicitly modeled may be mathematically complex in a principal-agent approach.
The approach of Bitrán and Ferrer [5] includes the pricing and composition of bundles, but competition is not explicitly considered since there was only one firm in their analysis, and there was no direct price response of other firms in the context of the multinomial logit approach they considered. In the same line, in the extension proposed by Pérez et al. [15], even though it considered constrained pricing with a wp, the price competition was again not analyzed.
Our research focuses on competitive pricing in a market with multiple market segments and consumers' willingness to pay. We model the demand through a constrained multinomial logit [31] to represent consumers' preferences and the wp constraints.

Consumers' Discrete Choice Models and Consumers' Willingness to Pay
The consumers' discrete choice of a product or service is the decision-making process conducted by a consumer who selects what to buy from a given set of discrete alternatives available on the market by evaluating the attributes of the products and their prices. Considering endogenous and exogenous factors, consumers narrow the set of alternatives to those that are feasible and known to them when the purchase decision is made. Manski [47] defined discrete choice models to represent the decision-making process followed by a consumer when deciding among alternatives. This author also stated perhaps the most important assumption behind this process, the consumers' rationality: consumers prefer and select the choice with the highest valuation according to their preferences. This process may be deterministic, but we will focus on random utility models since it is realistic to assume that there is no complete information about consumers' preferences. These preferences have to be estimated, for example, through probabilistic approaches, such as the one analyzed in Block and Marschak [48]. Among random utility models, a widely adopted one is the multinomial logit model (MNL) [46], which considers that the consumer's utility is random and has two terms: one that is deterministic and depends on price and the attributes of the discrete set of alternatives, and another that is an error term assumed to follow a Gumbel distribution [49][50][51].
The MNL model is considered a satisfactory approximation to the consumers' valuation when purchasing goods or services. According to Schmalensee [52] and Schmalensee and Thisse [53], the MNL is more flexible than any other deterministic approach to the decision-making process, given that it is based on microeconomic theory and considers the consumers' utility maximization. There are several theoretical and applied research papers in operations research and management science applying MNL to represent the demand for optimization problems. Some authors find the optimal price and other decisions, such as facility location, production, inventory, and transportation, among others. For example, in studies by Zhang [10], Luer-Villagra and Marianov [54], the authors proposed facility location problems, in which non-linear optimization models were presented and solved through meta-heuristics methods to find near-optimal locations and prices. Aydin and Porteus [8] set the optimal inventory levels and prices. Similarly, Yaghin et al. [55] proposed optimal product distribution strategies and pricing in a two-echelon supply chain. Shao [56] proposed a model that deals with the optimal design of a product and its pricing. Rezapour and Ksaibati [57] improved the use of models in the MNL family to deal with the scale factor and obtain a better fit.
Consumers' choices are affected not only by preferences but also by constraints. Perhaps the most important one is the maximum willingness to pay, which applies even if the consumer has a high valuation of an alternative, since some constraints, such as a maximum budget that cannot be surpassed, may affect their willingness to pay. A classic and strict way to understand a constraint of this type is, for example, the approach of Bolton [45], in which the customer will not select the alternative if the price is even slightly higher than the budget limit. Nevertheless, there are other points of view regarding this topic, in which the maximum willingness to pay is understood as a soft (that is, less strict) limit for the customers' choice. In particular, some researchers in marketing and business have empirically analyzed this assumption regarding the wp [58,59]. Wang et al. [60] offered insights regarding the importance of wp, indicating that consumers show uncertainty concerning their wp, and they proposed to model it as distributed over a probability range that influences consumers' choice. A wp interval may also be understood as a variance in the individual choice centered on an individual wp [58,61]. Dost et al. [59] analyzed consumers in ranges that define their wp, and they applied a piecewise linear decreasing function to represent it. Dost and Geiger [62] extended the model of [59] by implementing a differentiable function to separate consumers according to their wp. This may be seen as a first approach to the constrained multinomial logit model (CMNL) [31]; although it is conceptually similar, it does not have the same mathematical formulation.
A general assumption of the multinomial logit model is that customers have a compensatory behavior; that is, they may balance their choice to maintain a utility level by trading off among the attributes of the product, for example, between quantity and price. This compensatory behavior is not always realistic because products are sometimes bounded in their continuous attributes, and the trade-offs are complicated to handle when the attributes are discrete or categorical [63]. Including constraints in the customers' utility maximization problem makes pricing optimization complex [10,64].
There are some approaches to model constraints over features to restrict candidate solutions within customer-feasible alternatives from the perspective of discrete choice models. Let us take a closer look at the options and applications proposed by some authors in this regard:
• Swait [65] represented the customer's choice as an optimization problem, where customers choose the alternative that maximizes their utility, subject to threshold-like restrictions over the features. The model allows threshold violations on the features, and these are reflected in the utility level as costs, introduced as penalization or cut-off functions in the utility function.
• Cantillo and Ortúzar [66] introduced a two-stage semi-compensatory model, considering that the probability that one alternative is included in the choice set can be represented as the probability that all its features are not over a certain threshold, which depends on the specific customer. If two alternatives are selected in the first stage, a compensatory behavior is assumed in the second one. The authors concluded that if there is empirical evidence of the existence of these thresholds, then the implementation of a compensatory model produces biased parameter estimates.
• Based on Swait [65], Martínez et al. [31], and Cascetta and Papola [67], one can consider the feasibility of an alternative for the consumers by assuming that their utility splits into a compensatory and a non-compensatory term. To achieve this objective, the authors introduced cut-off or penalization functions for each feature. These penalization functions allow a smooth transition between the feasible compensatory product space and the non-feasible non-compensatory space, permitting the constraints to be mildly violated by the decision maker.
• Castro et al. [68] studied parameter estimation under the CMNL. The authors concluded that the CMNL is an adequate model to describe customers' behavior in some applications since it fits better than the MNL under the assumption of bounded rationality and replicates MNL estimations in the unrestricted case.
• There are recent applications of the CMNL in diverse areas, such as transportation mode selection [68], schools [69], and urban economy [70,71], among others.
• As mentioned in Dost and Geiger [62], it is necessary to include the customers' maximum willingness to pay in decision-making models when designing a pricing strategy for products and services.
We organized the paper into the following sections:
1. Introduction: We included the description of the context, the decision problem, the research highlights, and the literature review.
2. Materials and methods: Contains the model description. We also present the PSO strategy to find the equilibrium.
3. Results: We show numerical experiments for illustrative instances and a real case from a Chilean telecommunications company.
4. Conclusions: Contains a discussion of the scope of the research and its conclusions.

Materials and Methods
We propose a mathematical model to support pricing decisions for products with multiple attributes in competitive markets, considering consumers' willingness to pay and multiple segments. We represent the demands by modeling the consumer behavior for products and services through a multinomial logit model and then include consumers' maximum willingness to pay through soft constraints within the demand function. The proposed model is a nonlinear profit maximization problem. We start by describing the consumer behavior model, then the mathematical optimization model, and finally, the solution strategy.

Consumer Behavior
The consumer behavior or demand function is based on the stochastic discrete choice model called constrained multinomial logit. This model is an econometric tool based on the theory of rationality, in which each consumer chooses the product that generates the greatest utility among a finite set of alternatives. In mathematical terms, $U^{s,k}_i$ is defined as the utility of product $i \in N_k$ of firm $k \in K$ for consumers in socioeconomic segment $s$. In random utility models, the probability that a consumer of segment $s$ chooses product $i \in N_k$ of firm $k \in K$ is

$$q^{s,k}_i = P\left(U^{s,k}_i > U^{s,l}_j,\ \forall l \in K,\ \forall j \in N_l\right) \tag{1}$$

In these models, $U^{s,k}_i$ is considered to be known by the consumer, but not by the modeler or data analyst. This utility is represented as the sum of two components: a deterministic part $V^{s,k}_i$, which is known by the modeler and is a function of the vector of attributes that define product $i$, and a random component $\xi^{s,k}_i$, which represents the error:

$$U^{s,k}_i = V^{s,k}_i + \xi^{s,k}_i \tag{2}$$

If it is assumed that the errors $\xi^{s,k}_i$ are independent and identically distributed (iid) Gumbel, then the choice probability (1) has a multinomial logit functional form:

$$q^{s,k}_i = \frac{e^{V^{s,k}_i}}{\gamma_s + \sum_{l \in K} \sum_{j \in N_l} e^{V^{s,l}_j}} \tag{3}$$

where $\gamma_s = e^{V^{s,0}}$ and $V^{s,0}$ is the non-purchase utility of consumers in segment $s$. The following formulation is proposed for $V^{s,k}_i$:

$$V^{s,k}_i = \beta_s\, p^k_i + I^{s,k}_0 + I^{s,k}_i \tag{4}$$

where
• $\beta_s < 0$ is the marginal rate of substitution of income for price in segment $s$. In other words, it is the importance that each segment assigns to the price of the product.
• $I^{s,k}_0$ is the attractiveness of firm $k$ for consumers in segment $s$, that is, the preference with respect to each firm or brand.
• $I^{s,k}_i$ is the total attractiveness of product $i$ of firm $k$ for consumers in segment $s$. This attractiveness is obtained as a combination of the valuation of the attributes of each product in each firm.
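To make the multinomial logit probabilities concrete, the following minimal Python sketch evaluates them for a single segment. It assumes that the deterministic utility is additive in price, firm attractiveness, and product attractiveness; the function name `mnl_probabilities`, the firm labels, and all numerical values are hypothetical and only illustrative.

```python
import math

def mnl_probabilities(prices, firm_attr, prod_attr, beta, v0=0.0):
    """Multinomial logit choice probabilities for one consumer segment.

    prices, prod_attr: dict firm -> list with one value per product
    firm_attr: dict firm -> attractiveness of the firm (brand)
    beta: price sensitivity of the segment (negative)
    v0: non-purchase utility of the segment
    """
    # Assumed additive utility: V = beta * price + I_firm + I_product
    utils = {
        (k, i): beta * p + firm_attr[k] + prod_attr[k][i]
        for k, plist in prices.items()
        for i, p in enumerate(plist)
    }
    # The denominator includes the non-purchase option exp(v0)
    denom = math.exp(v0) + sum(math.exp(v) for v in utils.values())
    return {key: math.exp(v) / denom for key, v in utils.items()}

# Illustrative (made-up) market: firm A sells two bundles, firm B sells one
prices = {"A": [20.0, 35.0], "B": [25.0]}
firm_attr = {"A": 1.0, "B": 0.8}
prod_attr = {"A": [2.0, 3.5], "B": [2.6]}
q = mnl_probabilities(prices, firm_attr, prod_attr, beta=-0.1)
no_purchase = 1.0 - sum(q.values())
```

The probabilities of all products plus the non-purchase probability add up to one, so `no_purchase` is the share of the segment that buys nothing.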
An assumption of (4) is that consumers follow a compensatory behavior; that is, the utility level of a product can be kept fixed by increasing its price and its attractiveness simultaneously. However, in many contexts, consumers have bounds or thresholds with respect to price (or other attributes that define the product), which makes this compensatory assumption unrealistic. In particular, consumers in segment $s$ cannot spend more than their maximum willingness to pay $wp_s$. Hence, a feasible product for segment $s$ must meet $p^k_i < wp_s$, which is the hard version of the constraint. Instead, we handle these constraints with the CMNL model [31], in which individuals impose thresholds on the attributes directly in the utility function, through penalties in binomial logit form that greatly reduce the utility of products that violate a constraint. Martínez et al. [31] assumed that the deterministic utility can be separated into a compensatory term ($V^{s,k}_i$) and a non-compensatory term ($\ln \phi^{s,k}_i$) indicating the feasibility of product $i$ for segment $s$:

$$\tilde{V}^{s,k}_i = V^{s,k}_i + \ln \phi^{s,k}_i \tag{5}$$

where $\ln(\phi^{s,k}_i)$ is a cut-off or penalty function imposed by consumer segment $s$ on the price of product $i$. This logarithmic penalty achieves a smooth transition between the compensatory feasible space ($p^k_i < wp_s$) and the infeasible non-compensatory space ($p^k_i > wp_s$), allowing budget constraints to be slightly breached by the consumer.
Finally, the probability $q^{s,k}_i$ that a consumer in segment $s \in S$ chooses product $i \in N_k$ of firm $k \in K$, with cut-off functions, is

$$q^{s,k}_i = \frac{\phi^{s,k}_i\, e^{V^{s,k}_i}}{\gamma_s + \sum_{l \in K} \sum_{j \in N_l} \phi^{s,l}_j\, e^{V^{s,l}_j}} \tag{6}$$

In particular, in (6), applying the CMNL approach, we model a buying probability $\phi^{s,k}_i$ that penalizes the consumer's utility and depends on the price $p^k_i$ of product $i$ of firm $k$:

$$\phi^{s,k}_i = \frac{1}{1 + \exp\left(\sigma_s\,(p^k_i - wp_s + \tau_s)\right)} \tag{7}$$

$$\tau_s = \frac{1}{\sigma_s} \ln\left(\frac{1 - \eta_s}{\eta_s}\right) \tag{8}$$

where $\sigma_s$ is the inverse of the variance of consumers concerning the reservation price. The parameter $\tau_s$ in (8) is inversely proportional to $\sigma_s$ and depends on $\eta_s$, the proportion of the population that violates the price constraint. Since $\phi^{s,k}_i \in [0, 1]$, each time the price surpasses the wp, $\phi^{s,k}_i$ lowers the choice probability for that product. In the limit, if $p^k_i \ll wp_s$, then $\phi^{s,k}_i \to 1$, and if $p^k_i = wp_s$, then $\phi^{s,k}_i = \eta_s$. One can understand $\phi^{s,k}_i$ as a soft bound that replaces the constraint requiring the price to be lower than or equal to the willingness to pay ($p^k_i \le wp_s$), a constraint that precludes considering prices above the willingness to pay. Through $\phi^{s,k}_i$, in contrast, one can consider prices slightly higher than the willingness to pay. The cutoffs emulate the hard constraint through their 's' shape. The willingness to pay splits the cutoff: $\phi^{s,k}_i$ takes its higher values when prices are lower than wp, and it decreases as the price increases. Furthermore, the higher the value of $\sigma_s$, the more negative the slope of the 's'. The cutoff functions act on each exponential term by lowering the relative relevance of the alternative as the price surpasses the willingness to pay. Note that the cutoff does not eliminate the option when the price is 'slightly' above the willingness to pay, but the probability of choosing that bundle is lower than in the absence of the cutoff.
When the prices do not surpass the willingness to pay, the cutoff values are almost one; likewise, not considering the cutoffs is equivalent to setting $\phi^{s,k}_i = 1$. In both cases, the probabilities take the same (or approximately the same) form as in the classic multinomial logit.
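The behavior of the cutoff described above can be checked numerically. The sketch below is a minimal illustration that assumes the binomial logit form of the cutoff, $\phi = 1 / (1 + \exp(\sigma (p - wp + \tau)))$ with $\tau = \ln((1-\eta)/\eta)/\sigma$; the function names and the parameter values are hypothetical.

```python
import math

def tau(sigma, eta):
    """Shift that makes the cutoff equal to eta exactly at price == wp."""
    return math.log((1.0 - eta) / eta) / sigma

def cutoff(price, wp, sigma, eta):
    """Soft willingness-to-pay bound (assumed binomial logit form)."""
    return 1.0 / (1.0 + math.exp(sigma * (price - wp + tau(sigma, eta))))

def cmnl_probabilities(utils, prices, wp, sigma, eta, v0=0.0):
    """Choice probabilities where each exp(V) is scaled by its cutoff."""
    weights = [cutoff(p, wp, sigma, eta) * math.exp(v)
               for v, p in zip(utils, prices)]
    denom = math.exp(v0) + sum(weights)  # non-purchase option has no cutoff
    return [w / denom for w in weights]

# Illustrative (made-up) values: wp = 30, 10% of consumers exceed their wp
wp, sigma, eta = 30.0, 1.0, 0.10
phi_low = cutoff(10.0, wp, sigma, eta)   # price far below wp
phi_at = cutoff(30.0, wp, sigma, eta)    # price equal to wp
phi_high = cutoff(35.0, wp, sigma, eta)  # price above wp

# Two bundles with the same utility; only the second is priced above wp
probs = cmnl_probabilities([1.0, 1.0], [25.0, 35.0], wp, sigma, eta)
```

With identical utilities, the bundle priced above the willingness to pay receives a much smaller choice probability, which is the soft-constraint effect the text describes.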
When Martínez et al. [31] proposed the constrained MNL (CMNL), they explained the model construction and its interpretation for endogenous and exogenous constraints. They indicated theoretical insights regarding parameter estimation but did not show details. On the other hand, Castro et al. [68] described an estimation method specialized for the CMNL; through the maximum likelihood method, they calculated the first-order optimality conditions and discussed the possibility of identifying the cutoffs' parameters for the two possible cases, endogenous and exogenous, indicated by Martínez et al. [31]. We explain both in the following paragraphs. The numerators in the probability expression are the relevant parts for estimating the parameters of the consumers' utility. In the MNL context, the cutoffs are equivalent to one, so the numerator is $e^{V^{s,k}_i}$. In a CMNL, the numerator includes the cutoff and is $\phi^{s,k}_i e^{V^{s,k}_i}$. We can recognize a modified deterministic utility for the CMNL case as follows:
• The MNL deterministic term for the consumers' utility is

$$V^{s,k}_i = \beta_s\, p^k_i + I^{s,k}_0 + I^{s,k}_i \tag{9}$$

• The cutoff $\phi^{s,k}_i$ and the term $\tau_s$ in the CMNL version are already defined.
• One can recognize $\tilde{V}^{s,k}_i$ as the equivalent consumer utility in the CMNL model:

$$\tilde{V}^{s,k}_i = \beta_s\, p^k_i + I^{s,k}_0 + I^{s,k}_i - \ln\left(1 + \exp\left(\sigma_s\,(p^k_i - wp_s + \tau_s)\right)\right) \tag{10}$$

The last term, the logarithm, is quite relevant because it is what the CMNL adds to a traditional MNL. The modeler needs to estimate more parameters, as usual, through a maximum likelihood scheme. Considering the mathematical properties of likelihood functions, the estimators obtained under general assumptions are consistent, asymptotically normal, and efficient.
The modeler can classify the bound of cutoff (in our case, this bound is the willingness to pay) as exogenous or known by the modeler, or endogenous, meaning that the bound is a new parameter to estimate. Please note that our research is independent of this aspect, which means the bound could be exogenous or endogenous. If the modeler has the data, then she/he can estimate the CMNL and then calculate prices in competition with our approach. Interested modelers can follow guidelines from [31,68], estimate the parameters of a CMNL, and then use our methodology to estimate prices.
When the cutoff is exogenous, the bound is known by the modeler. From the data (in general, from surveys or data provided by a company), one can calculate the proportion $\eta$ of choices that do not respect the willingness to pay (bundles bought by consumers at a price higher than the wp) and obtain

$$\tau = \frac{1}{\omega} \ln\left(\frac{1 - \eta}{\eta}\right)$$

To do this, we need $\omega$, the scale parameter of the cutoff (the role played by $\sigma_s$ in the cutoff function). Hence, the modeler must estimate the parameters of a traditional MNL, and the only additional CMNL parameter required is $\omega$. In an endogenous approach, the modeler does not explicitly know the bound. Therefore, one needs to estimate the traditional MNL parameters and three additional sets of parameters specific to the CMNL: the willingness to pay, $\tau$, and $\omega$ [68].
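For the exogenous case, the calculation of $\eta$ and $\tau$ is simple enough to sketch. The snippet below assumes a hypothetical data set of (price paid, willingness to pay) pairs and a value of $\omega$ that has already been estimated; both are made up for illustration and do not come from the case study.

```python
import math

def violation_share(purchases):
    """Share of observed purchases whose price exceeds the consumer's wp.

    purchases: list of (price_paid, willingness_to_pay) pairs (hypothetical).
    """
    violations = sum(1 for price, wp in purchases if price > wp)
    return violations / len(purchases)

def tau_from_eta(eta, omega):
    """tau = (1 / omega) * ln((1 - eta) / eta), defined for 0 < eta < 1."""
    if not 0.0 < eta < 1.0:
        raise ValueError("eta must lie strictly between 0 and 1")
    return math.log((1.0 - eta) / eta) / omega

# Made-up survey: ten purchases, one of them above the stated wp of 30
purchases = [(28, 30), (31, 30), (25, 30), (29, 30), (30, 30),
             (27, 30), (22, 30), (26, 30), (24, 30), (29, 30)]
eta = violation_share(purchases)
tau_value = tau_from_eta(eta, omega=0.5)  # omega assumed already estimated
```

The guard on `eta` reflects that the logarithm is undefined when no purchase (or every purchase) violates the bound; in that situation the modeler would need a different calibration.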
Regarding the identification of the base attractiveness of a firm, it can only be estimated relative to a reference point, for example, by fixing one of the values to one, because we are only interested in the difference between firms. Consumers prefer some companies even when the offered bundle is the same in terms of composition and price. This could be due to the consumers' perception of quality or to personal preferences that the modeler cannot observe. On the other hand, the attractiveness of the services within the bundles (telecom services) depends on whether the service is included in the bundle and on the 'offered quantity', in this case, speed for internet services, minutes in telephony, and channel availability in TV [63].
Interested readers can find more details on $\sigma_s$, $\tau_s$ and, in general, the CMNL formulation and cutoffs in Martínez et al. [31], Castro et al. [68], and Pérez et al. [15].

Mathematical Model
Consider a firm $k \in \Psi$ selling products $i \in N_k$ at prices $p^k_i$. The cost of each product is $c^k_i$ and depends on the attributes that give the product its total attractiveness $I^k_i$. $P_k$ is the profit maximization problem faced by firm $k \in \Psi$:

$$P_k:\quad \max_{p^k}\ \Pi_k(p) = \sum_{s=1}^{S} H_s \sum_{j \in N_k} \left(p^k_j - c^k_j\right) q^{s,k}_j(p) \tag{11}$$

where $\Pi_k$ is the firm's total profit, $H_s$ is the number of consumers in segment $s \in \{1, ..., S\}$, $q^{s,k}_j$ is the price-dependent probability (6) that a consumer in segment $s$ buys product $j \in N_k$, and $p$ is the price vector of all products of all firms.
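As an illustration of how the profit of one firm can be evaluated for a given price vector, the following sketch adds up, over segments, the margin of each product times its expected demand. It assumes the CMNL probabilities with a binomial logit cutoff; the dictionary layout, the function name `firm_profit`, and all values are hypothetical, and the firm attractiveness is folded into the `attr` entries for brevity.

```python
import math

def firm_profit(firm, prices, costs, segments):
    """Expected profit of `firm`: sum_s H_s * sum_j (p_j - c_j) * q_j^s."""
    total = 0.0
    for seg in segments:
        weights = {}
        for k, plist in prices.items():
            for i, p in enumerate(plist):
                v = seg["beta"] * p + seg["attr"][k][i]        # utility
                x = seg["sigma"] * (p - seg["wp"] + seg["tau"])
                weights[(k, i)] = math.exp(v) / (1.0 + math.exp(x))  # phi*e^V
        denom = math.exp(seg["v0"]) + sum(weights.values())
        for i, p in enumerate(prices[firm]):
            q = weights[(firm, i)] / denom                     # CMNL share
            total += seg["size"] * (p - costs[firm][i]) * q
    return total

# Made-up single-firm, single-product market with one segment
def make_segment(wp):
    sigma, eta = 0.5, 0.1
    return {"size": 1000, "beta": -0.1, "v0": 0.0, "wp": wp, "sigma": sigma,
            "tau": math.log((1 - eta) / eta) / sigma, "attr": {"A": [4.0]}}

costs = {"A": [10.0]}
profit_tight = firm_profit("A", {"A": [35.0]}, costs, [make_segment(30.0)])
profit_loose = firm_profit("A", {"A": [35.0]}, costs, [make_segment(50.0)])
```

In this toy setting the same price yields a lower profit when the willingness to pay is 30 than when it is 50, because the cutoff removes part of the demand.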

Equivalent Nonlinear Problem and Our PSO Approach
We first calculate the first-order optimality conditions for each firm's maximization problem $P_k$, and then we propose a PSO approach in which we solve an equivalent nonlinear optimization problem to determine the prices that achieve an equilibrium.
The derivative of the profit of firm $k \in \Psi$ with respect to the product price $p^k_i$, with $i \in N_k$, is given in (12). We define $g^{s,k}_i$ in (13) and show the derivative of the market share $q^{s,k}_i$ in (14). See Appendix A for details on determining Equation (14).
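As a hedged sketch of how these derivatives can be obtained, assume the binomial logit cutoff $\phi^{s,k}_i = 1/(1+\exp(\sigma_s(p^k_i - wp_s + \tau_s)))$ and a utility that is linear in price with slope $\beta_s$. The symbol $\tilde{g}^{s,k}_i$ below is our own shorthand for this illustration and may differ from the definition of $g^{s,k}_i$ in (13).

```latex
\begin{align*}
\frac{\partial \ln \phi^{s,k}_i}{\partial p^k_i}
  &= -\sigma_s \left(1 - \phi^{s,k}_i\right), \\
\tilde{g}^{s,k}_i
  &:= \frac{\partial}{\partial p^k_i}\left(V^{s,k}_i + \ln \phi^{s,k}_i\right)
   = \beta_s - \sigma_s \left(1 - \phi^{s,k}_i\right) < 0, \\
\frac{\partial q^{s,k}_j}{\partial p^k_i}
  &= \begin{cases}
       \tilde{g}^{s,k}_i \, q^{s,k}_i \left(1 - q^{s,k}_i\right), & j = i, \\
       -\tilde{g}^{s,k}_i \, q^{s,k}_i \, q^{s,k}_j, & j \neq i,
     \end{cases} \\
\frac{\partial \Pi_k}{\partial p^k_i}
  &= \sum_{s} H_s \, q^{s,k}_i \left[ 1 + \tilde{g}^{s,k}_i
     \left( \left(p^k_i - c^k_i\right)
     - \sum_{j \in N_k} \left(p^k_j - c^k_j\right) q^{s,k}_j \right) \right].
\end{align*}
```

Under these assumptions, the own-price derivative of a market share is negative and the cross-price derivatives are positive, since $\tilde{g}^{s,k}_i < 0$ whenever $\beta_s < 0$ and $\sigma_s > 0$.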
When the optimality conditions hold for every firm $k \in \Psi$ and every product $i \in N_k$, there is an equilibrium for the game defined by (11), and our aim is to find the vector of prices satisfying these conditions. We update the expression for $\partial \Pi_k / \partial p^k_i = 0$ in Equation (15).
We write this expression as a fixed point on prices in (16).
Note that when there is only one segment, and for $i \neq j$, the derivative of the market share is always positive, meaning that increasing the price of product $i$ will increase all the other market shares. When $i = j$, a price increase logically reduces the market share of product $i$. If we impose the first-order optimality conditions, we obtain Equation (17). This particular case corresponds to a nonlinear equation system in terms of the products' prices. Note that $\beta < 0$, $\phi^k_i \le 1$, $q^k_i \le 1$, and if we assume that there are no cross-subsidies among a firm's products, then $p^k_j \ge c^k_j$, $\forall k \in \Psi$, $j \neq i$ and $j \in N_k$, and $p^k_i \ge c^k_i$. This simplification of our model is an extension of the case with one product, one segment, and without competition (one firm) reported in [15]. Our general expression in (16) extends the case reported in [21] by including competition and wp with the CMNL.
Any price vector $p$ satisfying $p^k_i = f^k_i(p)$ is an equilibrium for the game defined by (11). Note that even a simplified version of our problem without competition (one firm) may have a non-concave profit function on $p$ (for details, see Section 3.1). Then, to deal with our problem, in (18), we propose an equivalent nonlinear optimization problem on $p$ to find an equilibrium of (11). A price vector $p^*$ is an equilibrium for (11) when the objective function $z$ for (18) is zero.
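The idea of turning the equilibrium conditions into a minimization problem whose optimal value is zero can be sketched as follows. Since the fixed-point map $f^k_i$ is not reproduced here, this illustration swaps in a closely related residual: the sum of squared first-order conditions, approximated by central finite differences. The function `foc_residual` and the linear-demand duopoly used to exercise it are hypothetical stand-ins, not the model of the paper.

```python
def foc_residual(profits, owner, p, h=1e-5):
    """z(p): sum over prices of (d profit_owner / d p_i)^2.

    profits: dict firm -> callable(list_of_prices) -> float
    owner:   owner[i] is the firm that sets price i
    The gradient is approximated with central finite differences.
    """
    z = 0.0
    for i, k in enumerate(owner):
        up = list(p)
        up[i] += h
        down = list(p)
        down[i] -= h
        grad = (profits[k](up) - profits[k](down)) / (2.0 * h)
        z += grad * grad
    return z

# Toy duopoly with linear demand (NOT the CMNL demand of the paper):
# each firm has unit cost 1 and demand 10 - 2 * own price + rival price.
profits = {
    "A": lambda p: (p[0] - 1.0) * (10.0 - 2.0 * p[0] + p[1]),
    "B": lambda p: (p[1] - 1.0) * (10.0 - 2.0 * p[1] + p[0]),
}
owner = ["A", "B"]
z_equilibrium = foc_residual(profits, owner, [4.0, 4.0])
z_elsewhere = foc_residual(profits, owner, [3.0, 3.0])
```

In the toy game, both first-order conditions read $12 - 4p_{own} + p_{rival} = 0$, so the symmetric equilibrium is at a price of 4, where the residual vanishes up to rounding error.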
Pérez et al. [15] applied a bisection approach to solve the pricing and composition problem; they were able to solve it to optimality because they considered a simplified version of the problem, with a concave profit. Li et al. [21] considered neither competition nor consumers' willingness to pay; they proposed bisection search and gradient descent to solve the problem. We propose a heuristic approach based on PSO [27,72] to solve (18) and, hence, find an equilibrium for (11).
Considering that we are proposing a heuristic, we can only check whether the objective value of the solution for (18) is zero, which is equivalent to fulfilling the first-order optimality conditions for (11). We run various PSO trials, implementing multiple combinations of hyper-parameters and different swarm sizes; this allows us to find a vector $p$ satisfying the nonlinear system of equations in (17).
To solve (18), let us define the price vectors $p_w$, where $w \in W$ indexes the swarm individuals (particles), $v_w^m$ is the velocity of particle $w$ at iteration $m$ of the PSO algorithm, $c_0$ is the velocity inertia weight [73], and $c_1$ and $c_2$ are constants associated with the particle's best vector $pbest_w^m$ and the global best vector $gbest^m$, respectively [74]. Finally, we have the random parameters $r_1$ and $r_2$, both in $[0, 1]$. Following PSO, in Equations (19) and (20) we update the price vectors $p_w$ through the velocity $v_w^{m+1}$, which balances $v_w^m$, $pbest_w^m$, and $gbest^m$. The algorithm stops when $n$ consecutive solutions differ by less than a positive threshold.
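The update described above can be sketched with the textbook PSO rule, in which the new velocity combines the inertia term with the attraction toward the particle best and the global best. This is a minimal illustration and not the authors' implementation: the function name `pso_minimize`, the default hyper-parameters, the clipping to price bounds, and the toy objective (the fixed-point residual of a single product sold by a single firm under a plain MNL without wp) are all assumptions.

```python
import numpy as np

def pso_minimize(objective, lower, upper, swarm_size=40, max_iter=500,
                 c0=0.7, c1=1.5, c2=1.5, tol=1e-10, patience=50, seed=0):
    """Textbook PSO; stops after `patience` iterations with improvement < tol."""
    rng = np.random.default_rng(seed)
    lower, upper = np.asarray(lower, float), np.asarray(upper, float)
    dim = lower.size
    pos = rng.uniform(lower, upper, size=(swarm_size, dim))
    vel = np.zeros((swarm_size, dim))
    pbest = pos.copy()
    pbest_val = np.array([objective(p) for p in pos])
    g = int(np.argmin(pbest_val))
    gbest, gbest_val = pbest[g].copy(), pbest_val[g]
    stall = 0
    for _ in range(max_iter):
        r1 = rng.random((swarm_size, dim))
        r2 = rng.random((swarm_size, dim))
        # Velocity update: inertia + pull to particle best + pull to global best.
        vel = c0 * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        pos = np.clip(pos + vel, lower, upper)
        vals = np.array([objective(p) for p in pos])
        improved = vals < pbest_val
        pbest[improved] = pos[improved]
        pbest_val[improved] = vals[improved]
        g = int(np.argmin(pbest_val))
        previous = gbest_val
        if pbest_val[g] < gbest_val:
            gbest, gbest_val = pbest[g].copy(), pbest_val[g]
        stall = stall + 1 if previous - gbest_val < tol else 0
        if stall >= patience:
            break
    return gbest, gbest_val

def residual(p, cost=1.0, attract=2.0, beta=-0.5, outside=0.0):
    """Toy objective: |p - f(p)| for one product, one firm, plain MNL (assumed)."""
    q = 1.0 / (1.0 + np.exp(outside - attract - beta * p[0]))
    return abs(p[0] - (cost - 1.0 / (beta * (1.0 - q))))

p_star, z_star = pso_minimize(residual, lower=[1.0], upper=[25.0])
```

In the toy problem, the fixed point comes from the first-order condition of the profit $(p - c)\,q(p)$, which gives $p = c - 1/(\beta(1 - q))$. A value of `z_star` close to zero indicates that the first-order condition holds at `p_star`.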
It is relevant to point out that when the optimal value of the objective function in (18) is zero, the optimal price vector accomplishes the first-order optimality conditions. Because for every pair (i, k) the term of the sum is zero, every equation of the set of optimality conditions for the problem (11) holds. Nevertheless, not every price vector accomplishing the first-order optimality conditions will be an equilibrium. To improve the search for equilibria (although we cannot ensure finding one), we include additional requirements for the PSO heuristic, which reports whether the Hessian of the problem in the expression (18) evaluated in the price vector found is negative definite or not.
The procedure is:
• If PSO does not find a solution such that $z^*$ is below the tolerance (a small positive number defined by the modeler), then the program reports the best solution (the one with the smallest $z^*$) and indicates that this solution does not satisfy the first-order optimality conditions.
• If PSO finds one or more solution vectors, then we evaluate the second-order conditions at these price points:
- If any price vector has a negative-definite Hessian, we report this vector as the solution.
- If no vector has a negative-definite Hessian, the methodology reports that no equilibrium was found, together with the solutions found.
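The second-order check in the procedure above can be sketched as follows. This is an illustrative sketch under stated assumptions: the Hessian is approximated with central finite differences, negative definiteness is tested through the eigenvalues of the symmetrized matrix, and the names `numerical_hessian`, `is_negative_definite`, and `classify`, as well as the tolerance values, are hypothetical. The two test functions are toy stand-ins for a firm's profit, not the profit function of the paper.

```python
import numpy as np

def numerical_hessian(func, x, h=1e-4):
    """Central finite-difference Hessian of a scalar function at x."""
    x = np.asarray(x, dtype=float)
    n = x.size
    hess = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            ei = np.zeros(n)
            ej = np.zeros(n)
            ei[i] = h
            ej[j] = h
            hess[i, j] = (func(x + ei + ej) - func(x + ei - ej)
                          - func(x - ei + ej) + func(x - ei - ej)) / (4 * h * h)
    return hess

def is_negative_definite(hess, tol=1e-8):
    """True when all eigenvalues of the symmetrized Hessian are below -tol."""
    sym = 0.5 * (hess + hess.T)
    return bool(np.all(np.linalg.eigvalsh(sym) < -tol))

def classify(z_star, hessians, eps=1e-6):
    """Mirror the reporting rules: first-order check, then second-order check."""
    if z_star >= eps:
        return "first-order conditions not met"
    if all(is_negative_definite(h) for h in hessians):
        return "equilibrium found"
    return "no equilibrium found"

# Toy stand-ins for a firm's profit around a candidate price vector.
concave = lambda x: -(x[0] ** 2 + x[0] * x[1] + x[1] ** 2)
saddle = lambda x: x[0] ** 2 - x[1] ** 2

h_concave = numerical_hessian(concave, [1.0, 2.0])
h_saddle = numerical_hessian(saddle, [1.0, 2.0])
```

In the game, each firm's Hessian would be taken with respect to that firm's own prices only, holding the competitors' prices fixed at the candidate vector.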
Our results indicate that PSO is quite efficient in finding price vectors fulfilling (16). We show details on PSO convergence; see Appendix C.

Results
This section contains three subsections of analysis and numerical experiments, each dedicated to one of the key concepts of this research. In the first subsection, we analyze the profit function in the presence of multiple consumer segments. In the second subsection, we study the equilibrium prices in the presence of wp. The third subsection describes a case based on data from a Chilean telecommunications firm.

Profit Function with Multiple Segments
In this subsection, we make explicit the non-concavity of the profit function. Let us consider a setting without competition; in Figure 1, we present two cases in which we plot the profit as a function of the price. This situation represents a firm offering one product to three consumer segments, each having a quasi-concave profit shape. In the case of Figure 1a, the total profit is non-concave, and in Figure 1b, it is quasi-concave. In Figure 1, for both cases, the cost per product is 1, the price varies in {1, ..., 25}, the base attractiveness is 2, and γ = 5. There are differences in segment sizes H and price sensitivities β. We show the details of H and β in Table 1. From these results, it is clear that even in simple cases without competition and with few differences in parameters, one can obtain non-concave total profit curves. In this context, an approach based on heuristics, particularly PSO, is a suitable alternative for dealing with this highly nonlinear problem.

Table 1. Segment sizes H and price sensitivities β for the non-concave and quasi-concave total profit cases, for segments S1, S2, and S3.
Non-concave: S1, S2, S3. Quasi-concave: S1, S2, S3.

Let us now consider one firm (no competition), two segments, and two products. The non-concavity of the profit function is also visible here. Figures 2 and 3 show the shapes of the firm's total profit and of the profit in the two segments. We consider product prices in {2, ..., 26}; the attractiveness of Product 1 is 2, and that of Product 2 is 2.5. As in the two-dimensional cases in Figure 1, Figure 2 shows a non-concave total profit shape, while in Figure 3 it is concave. The parameters for each case are in Table 2.

Table 2. Parameters for profit profiles in Figure 2.

These last cases with two segments and two products consider only one firm and hence no competition. We also do not analyze wp, which means that even in the simplest case, the problem is a non-concave maximization problem. Note that even this simplest version of our problem considers more phenomena than models without competition or consumer segments, such as [5,15,16,42,75]. The purpose of these cases, in which we plotted the profit function explicitly, was to show the complexity inherent in the profit's non-concavity.
Observing Figures 2 and 3, it seems that the size of the segment and β are the main drivers that define the maximum profit. In competition, we expect to see some firms specializing their offers to certain more profitable segments, and we are yet to see the effect of wp. We analyze these aspects in the following Sections 3.2 and 3.3.
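The one-product, three-segment experiment of this subsection can be reproduced in spirit with a few lines of code. The sketch below assumes a plain MNL share with an outside option of utility γ and uses the values stated in the text (cost 1, prices in {1, ..., 25}, base attractiveness 2, γ = 5). The segment sizes and price sensitivities are hypothetical placeholders, not the values of Table 1, and the function names are illustrative.

```python
import numpy as np

def segment_profit(prices, cost, attract, beta, outside_utility, size):
    """Profit of one segment: size * margin * MNL share (assumed demand form)."""
    expu = np.exp(attract + beta * prices)
    share = expu / (np.exp(outside_utility) + expu)
    return size * (prices - cost) * share

def count_interior_peaks(values):
    """Number of strict local maxima on the grid, excluding the end points."""
    mid = values[1:-1]
    return int(np.sum((mid > values[:-2]) & (mid > values[2:])))

prices = np.arange(1, 26, dtype=float)   # price grid {1, ..., 25}
sizes = (60.0, 30.0, 10.0)               # hypothetical segment sizes H
betas = (-1.0, -0.3, -0.08)              # hypothetical price sensitivities

per_segment = np.array([
    segment_profit(prices, cost=1.0, attract=2.0, beta=b,
                   outside_utility=5.0, size=h)
    for h, b in zip(sizes, betas)
])
total = per_segment.sum(axis=0)
peaks_total = count_interior_peaks(total)
peaks_by_segment = [count_interior_peaks(row) for row in per_segment]
```

Each segment curve has a single peak, while `peaks_total` reports how many local maxima the aggregated profit has for the chosen H and β. More than one peak means the total profit is not quasi-concave (and hence not concave), which is the situation in which a local search can stall.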

Pricing and Willingness to Pay
Let us analyze the effect of competition and wp by considering two cases, both with two firms and one consumer segment: one without wp and the other with it. We analyze these cases to quantify the effect of competition and wp on equilibrium prices. We will see the relevance of wp: although a CMNL is more difficult to estimate, if the evidence shows that there are constraints on the wp, it is advisable to include them in the analysis. We also give some insights and cite well-established methodologies to estimate the additional parameters required to represent the wp as a soft constraint within an MNL.
We then consider two cases: one with wp >> 0 and the other with wp > 0. The general parameters are in Table 3, and the wp, along with the products' and firms' base attractiveness, is in Table 4. Note that wp >> 0 means that, although we present values for γ and σ, in this particular case the cutoff (7) is approximately 1; in other words, the model is an MNL instead of a CMNL.

Table 3. General parameters: two competing firms and one consumer segment.
Parameter, Value

The product costs are in Table 5. Given the attractiveness values presented in Table 4, Firm 1 has lower base and product attractiveness than Firm 2, but Firm 2 incurs higher production costs than Firm 1.
As we can see in Tables 5 and 6, the model is able to handle the limits imposed by the wp: the firms' prices are higher when wp >> 0, and the respective market shares are lower. These results are expected; the point is that our methodology can quantify this effect. The last and most important result, presented in Table 7, is that with MNL (wp >> 0), both firms' profits are higher than with CMNL (wp > 0). With CMNL, the modeler also needs to estimate more parameters than with MNL. This raises some questions: What does it mean? What is the advantage of including the wp?
The answer lies in the data: including the wp requires the modeler to have information about consumers' preferences. This information may come from specialized surveys or from firms' historical records (for example, stored in firms' CRM databases). That is, the wp and its corresponding cutoff exist not because the modeler decides so, but because the data on consumers' preferences indicate that those constraints (cutoffs) exist.
The most common approach is to estimate a plain MNL model, disregarding phenomena other than the consumers' choice. Given information on consumers' preferences, a standard MNL can be estimated through a maximum-likelihood approach. Nevertheless, the information may indicate that there are constraints regarding wp. If that is the case, the model turns into a CMNL, and the methodology is slightly different. In particular, the probability $q_i^{s,k}$ in Equation (6) now includes the soft constraint represented by $\phi_i^{s,k}$. Fortunately, there is a well-established way to analyze this case: as described by Martínez et al. [31] and Castro et al. [68], one can formulate the maximum-likelihood problem for the CMNL to estimate the parameters. If the profit is higher without wp, why would the modeler have to include it? To answer this question, consider what happens when the firms set the equilibrium prices obtained with MNL (wp >> 0) while the consumers' data indicate the presence of wp. When this occurs, the firms' profit is even lower than when considering CMNL; see column MNL* in Table 7. This means that the constraint has to be considered: if the data suggest a constraint related to the consumers' maximum willingness to pay, ignoring it will lead to even lower profits for the firms. A worse scenario arises when only one of the firms identifies the constraint. We summarize these cases in Table 8; the results show that the best alternative for firms is not only to analyze competition but also to study whether the wp is present and, if so, to include it. Table 8 presents the firms' profits in parentheses; the first term refers to Firm 1 and the second to Firm 2.
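The relation between the MNL and the CMNL discussed in this subsection can be illustrated with a small sketch. The exact cutoff (7) and probability (6) are not reproduced here; the code assumes a binomial-logit cutoff $\phi = 1 / (1 + e^{\omega (p - wp)})$ that multiplies the exponential of the utility, in the spirit of the constrained logit literature. The steepness parameter `steepness` (ω), the function names, and all numerical values are hypothetical.

```python
import numpy as np

def cutoff(prices, wp, steepness):
    """Soft cutoff in (0, 1): near 1 when price << wp, near 0 when price >> wp."""
    return 1.0 / (1.0 + np.exp(steepness * (prices - wp)))

def cmnl_shares(prices, attract, beta, outside_utility, wp, steepness=2.0):
    """Assumed CMNL form: MNL weights scaled by the soft wp cutoff."""
    weights = cutoff(prices, wp, steepness) * np.exp(attract + beta * prices)
    return weights / (np.exp(outside_utility) + weights.sum())

def mnl_shares(prices, attract, beta, outside_utility):
    """Plain MNL shares with an outside option."""
    expu = np.exp(attract + beta * prices)
    return expu / (np.exp(outside_utility) + expu.sum())

# Hypothetical two-product example.
prices = np.array([10.0, 12.0])
attract = np.array([2.0, 2.5])
beta, outside = -0.2, 0.0

plain = mnl_shares(prices, attract, beta, outside)
loose = cmnl_shares(prices, attract, beta, outside, wp=1e6)   # wp >> 0
tight = cmnl_shares(prices, attract, beta, outside, wp=11.0)  # binding wp
```

With a very large wp, every cutoff is numerically 1 and `loose` coincides with `plain`, which is the sense in which the CMNL reproduces the MNL. With a binding wp, the product priced above the wp loses share and the total share captured by the firms falls.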

Pricing Telecommunications Bundles
Let us consider a case inspired by the Chilean telecommunications market. A Chilean firm was interested in this research as a tool to price its plans and allowed us to use some of its data. The main objective of this numerical experiment is to make explicit some of the findings of this research. Specifically, we apply this framework to price telecommunications bundles and analyze the importance of competition and of the consumers' willingness to pay in the equilibrium. It is important to make clear that the real market is more complex than this numerical experiment. In particular, there are more competing firms; we focused on those in direct competition in three specific segments. Thus, there are some approximations and phenomena that cannot be analyzed. Nevertheless, some of them have given us clues to continue this work, and we point them out to the reader as further research.
Historically, the Chilean telecommunications market has had three main network operators (both mobile and fixed): two incumbents with most of the market share and one challenger in third place. With the 2.1 GHz spectrum auction (in 2013), two new entrants appeared in the market; by the end of the first half of 2018, the more successful of them had reached approximately 15% of the total market share. Thus, there are now five network operators (with both mobile and fixed networks) that, combined, have more than 95% of the overall market share; the remaining share belongs to mobile virtual network operators (MVNOs) and small fixed operators. For this experiment and for the sake of simplicity, we focus on only two of the network operators. Let us call them Firm 1 and Firm 2, assuming, as an approximation, that the other firms are included in the experiment through the parameter γ; that is, there is a pricing game between Firms 1 and 2 that uses information gathered from public data and from one of the firms.
We consider three consumer segments with differences in sensitivity to price $\beta_s$, willingness to pay $wp_s$, size $H_s$, firms' base attractiveness, and preferences for the services included in the plans offered by the firms. Each firm offers four bundles that include one or more of the following services: mobile access (internet + voice), fixed internet, and TV. The last two are delivered over cable or optical fiber, depending on the firm: Firm 1 uses cable as its fixed last-mile technology, and Firm 2 uses optical fiber.
In Table 9, we show the assortment. The first column indicates the firm and bundle numbers. The second to fifth columns show the services offered: a number indicates that the service is included in the bundle and represents the firm's variable cost per user per month in USD. In all cases, we consider the services as offered in telecommunications plans. We include mobile internet (column 2), with voice minutes, internet GBytes, and SMS, which are unlimited in all cases; fixed internet (column 3); and two TV services, one by cable (column 4) and the other by fiber (column 5). The sixth column shows the firm's total cost per bundle per client per month in USD.
Note that the assortment in Table 9 represents the offer in the segments the firms call residential. Firms also offer other bundles, and they are open to be bought by any person. Then, we focus on three residential segments, S1, S2, and S3.
Note that we have only referred to the total attractiveness of a bundle in segment $s$ as $I_i^{s,k}$ and to the cost of a bundle as $c_i^k$. These parameters depend on the composition of the bundle. Let us define the composition of a bundle $i$ through the attributes $x_{ia}^k \in \{1, 0\}$, with $a \in A_i$, which must be understood as the bundle's characteristics that are valuable to consumers. As in Bitrán and Ferrer [5] and Pérez et al. [15], we assume that $I_i^{s,k}$ and $c_i^k$ are linear functions of $x_{ia}^k$; we can then define them by means of the individual attribute attractiveness $I_{ia}^{s,k}$ and the cost of an attribute $c_{ia}^k$ in (21) and (22), respectively. In this example, the bundles' attributes are the services included in the plans.
Firm 1, which provides bundles B1 to B4, offers mobile services (voice, internet, and SMS) over a radio access network with national coverage, while fixed internet and TV services are offered over an old mixed coaxial and ADSL (copper) cable network. Firm 2 has no mobile network of its own; it relies on a mobile virtual network operator (MVNO) contract with a network operator with national coverage, and it provides services over a fiber-to-the-home access network. Firm 2 offers better quality on fixed services and can compete in mobile as well, but its costs are higher because its fixed network is newer (in deployment) and, in mobile, it acts as an MVNO. Despite this, as can be seen in Table 10, the base attractiveness of Firm 2 is higher than that of Firm 1.
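The linear dependence of a bundle's attractiveness and cost on its composition, stated above for (21) and (22), can be written as follows. This is a sketch of the plain linear form implied by the text; whether the original expressions include additional terms (for example, a firm-level base attractiveness) is an assumption left open here.

```latex
I_i^{s,k} \;=\; \sum_{a \in A_i} I_{ia}^{s,k}\, x_{ia}^{k},
\qquad
c_i^{k} \;=\; \sum_{a \in A_i} c_{ia}^{k}\, x_{ia}^{k},
\qquad x_{ia}^{k} \in \{0, 1\}.
```

Under this form, a bundle that includes mobile access and fixed internet but no TV has a cost equal to the sum of the per-user costs of those two services, which is how the total cost column of Table 9 relates to the service columns.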
In Table 11, we show segment sizes H, the services attractiveness, and price sensitivities β, for two cases, CMNL and MNL (or equivalently with and without cutoffs). Note that the sum of the sizes H is 100; therefore, we represent the relative relevance of each segment. The attractiveness of the services multiplied by the composition allows us to estimate each bundle's attractiveness as we show in Table 10. Price sensitivities β < 0 represent the negative effect induced by price on the consumers' utility.
In this case, each telecommunications service is either part of the bundle or not, but in practice, the services included in the plan (composition) could have different quality levels, for example, internet speed, minutes, number of SMS, and channels. These differences, together with information regarding the consumers' choices and the companies' assortments, are the basis for estimating the composition of bundles and the attractiveness of each service in the plan [63]. Table 10 shows the consumers' maximum willingness to pay per segment wp, the utility of the other alternatives available in the market γ, the firms' base attractiveness, and the bundles' total attractiveness. We show the two cases, one with CMNL and the other with MNL. It is also relevant to point out that the CMNL and MNL price vectors are equilibria only with their corresponding attractiveness and β values.
We show the results for the MNL case (without considering wp) in Table 12 and for the CMNL case, including wp cutoffs, in Table 13. F2 does not have good coverage in the mobile network, since it uses national roaming to provide service, and it is deploying a fixed fiber-to-the-home network. As a result, F2 has higher costs of providing services than F1. Nevertheless, the perception of consumers (attractiveness) is more favorable to F2 than to F1. In the beginning, when F1 preliminarily analyzed the assortment and pricing without wp soft constraints and considering the preexisting F1's offer, something similar to the results in Table 12 occurred. F2 was planning to focus on segment S3, offering high-quality bundles at a higher price. Nevertheless, according to our analysis with competition and wp, they realized that they could not only compete and win in segment S3 but also gain share in segment S2. They are still not competitive in mobile service, as one can expect, because they do not have a mobile network. They also realized that consumers are not willing to pay a large sum of money for services they are used to, and that a good strategy was to lower prices despite the higher costs.
It is relevant to point out that this is an approximation of a Chilean firm's setting (F2). Most of the previously mentioned variables are public information, available from the Chilean National Regulatory Agency of Telecommunications (Subtel: Subsecretaría de Telecomunicaciones de Chile). We obtained the parameters required for this particular experiment (specifically, costs and consumer characteristics) from a Chilean operator. We used information gathered from Subtel's website to estimate the competing operator's costs and the base information of the market (https://www.subtel.gob.cl/estudios-yestadisticas/ (accessed on 1 August 2022)). For reproducibility purposes, we note that we used costs and prices in Chilean pesos in this section's numerical trials but, for readability, present the results in USD. Each USD is equivalent to approximately 750 Chilean pesos. Hence, please apply this conversion to obtain the consumers' utility functions and the firms' profits.
In Appendix B, we developed two additional sets of numerical instances. The first is to quantify the effects of wp on prices, and the second to show the relevance of cost of attributes in optimal prices. Although we already included these effects in the case that we analyze in this section, we provide more numerical instances to show how the prices change depending on various magnitudes of wp and costs of attributes.

Conclusions
We analyze price competition among firms selling products to multiple segments and include consumers' willingness to pay (wp) through CMNL. Our model is capable of estimating prices in equilibrium for a market with all the indicated conditions together. We show how these aspects turn into competitive advantages against competitors that do not consider them.
We propose an equivalent nonlinear optimization model that allows us to find prices in equilibrium. We construct an unconstrained nonlinear optimization problem by adding the absolute values of optimality first-order conditions of each firm's profit maximization problems. When the problem's optimal value is zero, then the first-order conditions hold, and if the Hessian is negative definite, then the price vector is an equilibrium for the game. Considering that the model is nonlinear and nonconvex, we apply a heuristic approach based on PSO to find prices. The last point means that we cannot ensure an equilibrium, but the results will indicate when we find one.
We show that analyzing and modeling the consumers' willingness to pay wp is key for firms, and that there is a competitive advantage when it is considered in pricing decisions. Although the parameters of a CMNL are more difficult to estimate than those of an MNL, it is highly recommended (perhaps required) to include wp soft constraints through a CMNL, given the benefits that the firm can obtain and the losses of not doing so. It is feasible to estimate CMNL parameters through well-defined methodologies, as proposed by Martínez et al. [31] and Castro et al. [68]. Although it is beyond the scope of this research, we want to point out that it is possible to characterize the attractiveness in more detail or to estimate compositions when there are two or more firms. The bundles have multiple attributes, and the modeler can quantify attributes such as quality or monthly consumption [63].
The costs and product attractiveness, combined with competition and wp, lead to quite different pricing approaches; taken together, these factors lead to completely different results from those of models that do not consider them. Not only are the profit estimations closer to reality, but the approach also helps firms to focus on the more profitable segments and to play better against competing firms.
The inclusion of multiple segments, despite its relevance, induces non-concavity in our model. We propose a PSO approach to handle this complex problem. PSO performs well in the sense that it is capable of finding equilibria ($z^* = 0$), and the convergence is fast on computers with affordable technical specifications.
In conclusion, it is relevant to model competition, wp constraints, and multiple consumers' segments for optimal pricing, despite its difficulties. A firm that makes an effort to include these aspects in its pricing decisions may have relevant advantages over its competitors.

Discussion
We include this subsection to enrich this paper with an overview of the written discussion we had with the reviewers, who we thank for their valuable comments. We illustrate this discussion with the following questions and answers.

1. Why are we doing this work?
We did this research to bring together two main aspects affecting pricing decisions: competition and segmented consumer preferences. We now know how much they jointly affect prices, and we offer readers a heuristic approach to quantify this effect. The empirical motivation lies in the telecommunications example we described. The history behind the idea is that a newcomer company contacted us; it had technical and cost-efficiency differences compared to the established operators. The consumers were aware of the differences in coverage, so the tradeoff with price was key for profit. We aimed to recommend a pricing strategy in this competitive environment. We believe that the contribution of the research is to quantify the effect on prices in this competitive environment by explicitly considering consumers' preferences, and to provide a methodology that, despite being approximate, is viable to implement with the data practitioners have.

2. What is our model better at? What are the advantages and disadvantages of using it?
To the best of our knowledge, we are the first to look at multiple firms competing across different consumer segments with a CMNL demand. We show the relevance of considering consumers' willingness to pay in a competitive pricing environment with multiple consumer segments. Although we cannot ensure finding an equilibrium, the results indicate whether we found one or not.
Although it is more difficult to estimate the parameters of a CMNL than those of an MNL, we still recommend including the cutoffs because of the benefits the firm can obtain by doing so and the disadvantages of not doing so. Note that Castro et al. (2013) demonstrated that if there are cutoff effects, the CMNL represents the consumers' preferences better. Additionally, it is feasible to estimate CMNL parameters through the well-defined methodologies proposed by Martínez et al. (2009) and Castro et al. (2013).
In the absence of cutoffs, CMNL reproduces MNL, which means that cutoffs are estimated to be 1. The costs and product attractiveness, combined with competition and the wp, lead to different pricing approaches; thus, all factors lead to entirely different results than those that do not consider them. The inclusion of multiple segments, despite its relevance, induces non-concavity to our model. We propose a PSO approach to handle this complex problem. PSO shows positive behavior in terms of being capable of finding solutions (z * near zero), and the convergence is fast using computers with affordable technical specs.

3. How are we dealing with uncertainty? Is the entire procedure reliable?
For a known dataset, all the parameters are uniquely defined, both from the logit model and costs. Therefore, our problem is framed within deterministic optimization.
Statistically, a modeler with data can estimate the utility function parameters. Within a known database and using maximum likelihood estimation, they (the modelers) can find consistent, asymptotically normal and efficient parameters. These properties allow the modeler also to calculate the t-test on each parameter, the F test in the case of multiple conditions on the parameters, and confidence intervals. Additionally, each confidence interval allows calculating a range for the parameters estimated. As ongoing work, we are dealing with the robustness of the solutions by introducing conditions for the potential variability within confidence intervals; we will be trying sensitivity analysis, robust, fuzzy or stochastic optimization tools.

Future Work
From the illustrative example of the Chilean mobile telecommunications market, it was also possible to approximate some of the effects that are already occurring in this market, and some further research topics have surfaced naturally from these experiments:

Acknowledgments: The authors would like to thank the students Nicolás Ulloa and Matías Rivera for their valuable work on this project, and Luis Aránguiz from Claro Chile S.A. for his comments, which were quite useful to make the model more realistic and improve its usability in real-world problems.

Conflicts of Interest:
The authors declare no conflict of interest.

Appendix A. Derivatives Explanation
We defined g s,k i in (13) and derivative for the market share of q s,k i in (14). We now give details on determining this last equation and help the reading process to flow easily. For explanatory purposes, we replicate Equation (14) in the following Equation (A1).
Let us begin with

If we analyze Equation (17) and assume that there is no limit on the wp, the fixed-point expression changes and corresponds to an extension of [5] to explicit price competition. Specifically, as wp increases, $\phi_i^k$ increases and the probability $q_i^k$ approaches the probability of a classic discrete multinomial logit model. This leads to a different equilibrium than in (17), whose solutions satisfy the nonlinear fixed-point equation system in (A2).
Comparing Equations (17) and (A2), one can see that there are differences when including wp in the model with competition for multiple product prices. Let us now make the effect of wp clearer through a fictional case with two firms with the same costs of attributes, and one identical product offered per firm. As wp increases, product prices and the market share in equilibrium will also do so. In Figure A1, we show this effect, and it is possible to observe that the proposed model allows us to explicitly find optimal price levels in competition in the presence of wp. We must mention that the results were obtained using PSO, running on an AMD Ryzen 9 series 4000 CPU with 40 GB of RAM, 2TB SSD. In this set of experiments designed to analyze the effect of consumers' maximum willingness to pay, the resulting prices at equilibrium were the same when varying the initial population of the PSO.

Appendix B.2. Costs of Attributes
Note that we have already shown the effect of considering the wp or not. Let us now focus on the effect of the cost and consider consumers with $\beta < 0$ and attractiveness of attributes (or attribute valuations) $I_{ia}^k$. If consumers have a higher valuation for a specific attribute $a^*$ and a firm $k^*$ has a lower cost $c_{ia^*}^{k^*}$ for it, then firm $k^*$ is expected to have an advantage over its competitors.
The effect mentioned above arises when there are differences in the costs of attributes, which can be shown explicitly with our model. Let us set a stage with two firms, each offering products with the same two attributes, each with equal attractiveness; the difference lies in the cost of producing the attributes faced by Firm 1. We show prices, market shares, and profits in equilibrium when varying the costs of the attributes of Firm 1. We show this setting for two cases, one with wp >> 0 and the other with wp near zero. Figures A2-A7 show the equilibrium prices for Firms 1 and 2 with respect to the cost of attribute 1 for Firm 1, for three values of the cost of attribute 2 for Firm 1 (C2F1 = 0.1, 1.0, and 2.0). The costs of attributes 1 and 2 for Firm 2 remain constant. From these results, it is possible to observe that, in both cases (wp >> 0 and wp ≈ 0), when the cost of attribute 2 of Firm 1 is greater than the cost of attribute 2 of Firm 2, the advantage in terms of profit goes to Firm 2; consistently, the opposite occurs when the cost of attribute 2 of Firm 1 is lower than that of Firm 2.
With the same set of figures and the effect of cost differentiation in mind, note that when the wp is considered, the effect of cost differentiation is less intense than when wp >> 0. This means that if the firms do not consider the wp when defining prices under competition, they will overestimate the effects of any cost differences with respect to the competing firms.

Appendix C. PSO Convergence
In this appendix, we show how the PSO behaves in terms of convergence, and explain the way we applied this heuristic to the equivalent model in (18) we proposed.
The PSO heuristic is quite efficient at finding prices in equilibrium. In Figure A8, we show the objective value of the best particle ($Z_{Gbest}$) for different initial population sizes with random initial positions. In all cases, the error was near zero in fewer than 70 iterations. The base case for these trials is the one whose parameters are in Table 3. The cases in Figure A8 consider two groups:
1. Group A includes the series Gbest 1-2000, Gbest 2-50,000, and Gbest 3-100,000. In each case, the initial population (swarm) size is the number at the end of the name; at each iteration, the parameters $c_1$ and $c_2$ were chosen randomly among the values {0.5; 1.0; 1.5; 2.0; 2.5}, and the inertia weight w was randomly selected among {0.4; 0.6; 0.8; 0.9; 1.0}.
2. Group B includes the series Gbest 4-2000, Gbest 5-50,000, and Gbest 6-100,000. In each case, the initial population (swarm) size is the number at the end of the name; at each iteration, $c_1 = c_2 = 1$, and the inertia weight w is linearly decreased from 1 to 0.6.
To improve the search for equilibria, we performed the following:
• We added multiple fresh starts for the PSO and stored the solutions satisfying z = 0 in (18). Each PSO trial has a different strategy for selecting the hyper-parameters ($c_0$, $c_1$, $c_2$) and the swarm size $|W|$; the initial swarm position is always random.
• For the solutions stored in the previous step, we evaluated, for each firm, whether the profit is at a maximum, by numerically evaluating the Hessian of the profit function at the price vector found.
• Finally, from all candidate solutions, we selected the one in which every firm reaches a maximum.
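The two hyper-parameter strategies described for Groups A and B can be sketched as per-iteration schedules. This is an illustrative sketch: the function names are hypothetical, the inertia weight is returned first in each tuple, and drawing $c_1$ and $c_2$ independently in Group A is an assumption, since the text does not say whether they share one draw.

```python
import numpy as np

C_CHOICES = (0.5, 1.0, 1.5, 2.0, 2.5)   # candidate values for c1 and c2
W_CHOICES = (0.4, 0.6, 0.8, 0.9, 1.0)   # candidate inertia weights

def group_a_schedule(n_iter, seed=0):
    """Group A: inertia, c1, and c2 drawn at random at every iteration."""
    rng = np.random.default_rng(seed)
    return [
        (float(rng.choice(W_CHOICES)),
         float(rng.choice(C_CHOICES)),   # c1, drawn independently (assumption)
         float(rng.choice(C_CHOICES)))   # c2, drawn independently (assumption)
        for _ in range(n_iter)
    ]

def group_b_schedule(n_iter, w_start=1.0, w_end=0.6):
    """Group B: c1 = c2 = 1 and inertia decreasing linearly from 1 to 0.6."""
    weights = np.linspace(w_start, w_end, n_iter)
    return [(float(w), 1.0, 1.0) for w in weights]

schedule_a = group_a_schedule(70)
schedule_b = group_b_schedule(70)
```

Each tuple `(inertia, c1, c2)` would be passed to one PSO iteration. A multi-start search then repeats the run with different schedules, swarm sizes, and random initial positions, and keeps the solutions whose objective value is zero for the second-order check.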