On the Benefits of Providing Timely Information in Ticket Queues with Balking and Calling Times

Abstract: We study a phenomenon causing server time loss in ticket queues with balking and calling times. A customer who balks from the queue after printing a ticket leaves a virtual entity in the queue that requires server time to be cleared. The longer the queue, the larger the proportion of customers abandoning their place, and the larger the server time loss due to calling customers who left the queue. The suggested solution is to give the customer the best possible estimate of her expected waiting time before printing a ticket, thus ensuring that, if she balks, no number that will waste server time is created in the queue. Although partially observable ticket queues have been studied in the literature, the addition of a calling time for absent customers creates a new type of problem that has been observed in real life but has not been formally addressed yet. We analyze this stochastic system, formulate its steady state probabilities, and calculate the system's performance measures. The analytical solution provided here is robust and can be applied to a wide range of customers' behavior functions. Finally, numerical analysis is performed that demonstrates the benefits of providing timely information to customers for different levels of traffic congestion.


Introduction
Waiting in a queue is one of the most ubiquitous human activities nowadays. We wait at the bank or at the pharmacy; we wait on the phone or in traffic at a crossing. As soon as we wait because others require the same service as us, we are "in the queue".
More often than not, a customer is free to leave the queue (abandonment) when it is too long (traffic may be a good counter-example). This decision is usually taken based on the perception of the expected waiting time. However, actually standing in a first come first served (FCFS) waiting line and being able to visually assess the queue length and its progression is becoming the exception. An extreme case is found in call centers that are simply unobservable to the customer. With a low joining cost (picking up the phone) and no other information on the queue status than what is provided by the server, leaving the queue is an inherent part of call center operations (and an important performance measure). Gans et al. [1] provide an extensive review on the subject.
Ticket queues as studied here offer only partial observability. Nonetheless, they present strong advantages and are quickly becoming the queue management norm. While in regular queues there are numerous ways that servers can optimize their utilization by using their non-service time to perform other tasks, see, e.g., [2][3][4][5][6][7][8][9][10][11][12][13][14][15], customers, on their side, have little to no flexibility in the way they use their waiting time. Ticket queues allow the handling of large numbers of customers in relatively small places. Customers are not required to stand, and the responsibility of keeping their place in the line is transferred to the server. They can use their waiting time to run some errands or in other more enjoyable ways than standing in line, such as enjoying a cup of coffee at a nearby location [16]. Ticket queues also hide FCFS violations (sometimes justified), thus lowering the customer perception of social injustice. All these aspects reduce customer stress, improve their overall satisfaction, and reduce their tendency to renege. Recent field studies show that customers prefer ticket queues over standing in line, especially in low traffic intensity settings (see [17]).
Besides their clear managerial advantages, ticket queues prompt new problems. Even when most customers are usually in the vicinity of the server, the expected waiting time can be difficult to assess. For the simplest ticket queues, a slip of paper is presented at the entrance of the waiting room with a number on it. The number is visible, and a customer can compare it to the one currently served as announced by a screen near the counter without actually taking the ticket. A customer, discouraged by what she estimates to be too long a wait, will balk without taking the ticket. In that simple case, a new customer has a good appraisal of the actual length of the queue by observing the number of people waiting, and the gap in ticket numbers is biased upward only by reneging customers.
In more complex queues, different classes of customers are dispatched to multiple servers, creating several parallel queues, all waiting in the same room. Tickets are printed upon request. The ticket dispenser is connected through a central dispatch unit to the display, and rules other than FCFS can be implemented without the outrage it would otherwise create (see [18]), for example, by mixing customers that made an appointment beforehand with the regular flow of walk-in customers. When entering the room, a customer requests a specific service from the automatic ticket dispenser, which then prints a ticket corresponding to the relevant queue. Only then can she evaluate the expected waiting time. In this case, balking creates gaps in the ticket queue, which make the ticket number gap unreliable as an estimator of the waiting time. Additionally, the fact that several queues are waiting in a common room makes it almost impossible to evaluate the queue length based on the apparent level of congestion. As shown in [19], the combination of the abandonment with the observability limitations sets ticket queues apart from regular queues. In models tackling this combination of features in the literature, customers systematically overestimate the actual queue length, which increases abandonment.
The problem we present here, illustrated in Figure 1, was first observed in local post offices and pharmacies, in which a basic flaw yields server losses at the worst time possible: when the system is congested. A customer enters the office and is prompted by a machine to take a ticket for one of several queues running independently of one another. All the queues share a common waiting room, making it difficult to estimate the actual length of one specific queue. Once the ticket is printed, the customer can compare her number to the one currently being served. At this point, some decide to leave immediately based on their estimation of the waiting time (balking). The longer the queue (apparent or real), the higher the probability for a customer to leave. If she decides to leave before being served, the customer's ticket is thrown away, but a virtual customer has been created in the queue. We remark that, given known customer patience and a reasonably accurate waiting time estimate, customers knowing that their odds of reneging are high will not join the queue at all, and balking should become the preponderant abandonment mechanism.
On the server side, numbers are being advanced by the clerk and served one after another in an FCFS regime. The important point is that it takes some time after a number is called to determine that no one will answer the call and that the number represents a customer who has left. The absurdity in this system is that, at peak periods, when the queue is longer, a larger portion of customers leave. The solution lies in giving the customer the best possible estimate of her expected waiting time before printing a ticket, thus ensuring that, if she balks, no number in the queue is created that will waste server time. This will also decrease the chances of reneging because of a longer than expected waiting time.
Balking, partially observable queues, and queues with calling times have all been studied in the literature. However, combining these features together reveals a problem that has yet to be addressed. In this work, we propose a robust methodology that is able to solve any reasonably well-behaved balking function.

Literature Review
The present work studies a service system with two different classes of customers, i.e., real versus virtual. In this context, bodies of research dealing with queues with heterogeneous customers are particularly relevant to us. Guo and Hassin [20], for example, investigate the join-or-balk decisions of customers with heterogeneous delay sensitivity in a queueing system with vacations, i.e., one in which idle servers can leave the system to carry out ancillary work. Based on empirical data from a call center, Yu et al. [21] analyze the abandonment behavior of customers and demonstrate how waiting costs are impacted by delayed announcements. In their setup, customer heterogeneity is modelled through the ratio between cost (waiting) and reward (being served), which is sampled from a folded normal distribution. In Hu et al. [22], a single server processes two classes of customers prone to balking. One class is informed about delays, while the other is not. The authors show how the system throughput and customers' welfare are directly impacted by the relative importance of the two classes in the system.
Klimenok et al. [23] analyze queueing systems in which customer priorities are dynamic. When they enter the queue, customers are assigned a low priority and a timer is switched on. When the timer expires, they either leave the system unserved with a certain probability or are assigned a higher priority. For further reading on heterogeneous customers in queuing systems, see [24][25][26][27][28][29]. In all these works, all the classes describe actual customers, while, in our work, one type of customer is not actually present in the system.

Another stream of literature related to our work addresses queues with balking customers. Sun et al. [30] study customers' equilibrium and socially optimal balking strategies in single-server Markovian queues with multiple vacations and N-policy. The authors observe that customers' individual behavior in stable equilibrium always congests the system more than the socially optimal one. Morozov et al. [31] consider a single-server retrial model with multiple customer classes. A new customer, facing a busy server upon arrival, may join the corresponding (class-dependent) orbiting queue with a class-dependent probability or leave the system forever (balking). An extensive simulation analysis provides insights into the model's stability and performance. Zirem et al. [32] present a batch arrival queue with general retrial time, breakdowns, repairs, and reserved time. Upon arrival, when a batch of customers finds the server free, one of the customers from the batch begins service and the rest of them join a so-called server orbit. Otherwise, the customers either balk or enter said orbit on an FCFS retrial basis. The authors analyze their model through the supplementary variables technique. Ke et al. [33] formalize an M/M/c balking retrial queue with vacation and study both single and multiple vacation policies. The optimization of the system is obtained through various methods, such as Quasi-Newton, Nelder-Mead simplex, and simulated annealing. For further discussion on balking customers, see [34][35][36][37][38][39]. Contrary to all these research efforts, in our setup, the system is not aware of the customer's balking.
Finally, unobservable queues have also been the subject of numerous studies in the literature. Haviv and Oz [40] consider an unobservable M/M/1 queue where customers are homogeneous with respect to both the service value and the waiting time cost. The authors present a classification of the regulation schemes under which the resulting equilibrium joining rate coincides with the socially optimal one. Yu et al. [41] study the equilibrium threshold balking strategies for unobservable single-server queues with server breakdowns and delayed repairs. Equilibrium mixed strategies are derived for both the partially observable and the unobservable queues. Lingenbrink and Iyer [42] demonstrate how, in a fixed-price service setup, a threshold-based (join/leave) partial information sharing policy can be optimal for the system. Kim and Kim [43] consider an unobservable queueing system with strategic customers. The authors show that customers who decide based on social welfare optimization arrive at a higher rate than those who decide based on profit maximization. Consequently, the admission fee required by the latter type of customers is higher than that required for the former. For further discussion on unobservable queues, see [44][45][46][47][48][49].
Xu et al. [50] consider balking customers with a threshold-type balking function while neglecting the clearance (calling) time for virtual customers. Through an approximation procedure, Ding et al. [51] show that communicating the current queue length to joining customers allows the reduction of the percentage of reneging customers. Numerous other papers on strategic queues assume that full information is provided to customers, who then decide to either balk or join the queue and wait for service (see e.g., [52][53][54][55]). By striving for a more realistic modelling of customers' behavior, Kuzu et al. [56] show that ticket queues are more efficient than formerly predicted in the literature. For further research on abandonments in ticket queues, see [57].
In the present work, we address the same problem for different levels of workload, with a special interest in overloaded cases where the stability of the queue is obtained only due to customers leaving the system. We study the value of providing timely information to customers and thus preventing the creation of tickets for customers who decide to leave. The damages shown by our study are, in some cases, considerable and fully justify the efforts by researchers to reach accurate models for abandonment in overloaded, partially observable queues and by practitioners to limit the waste related to calling absent customers as much as possible. We demonstrate the aforementioned phenomenon on a simple model according to which customers arrive in a ticket queue, receive a ticket on which their number in line is provided, and then decide to either stay in line or balk. This case is hereafter referred to as the "post office model", operating under the late information policy (LIP). The proposed solution is to inform customers of their number in line prior to printing a ticket, which is hereafter referred to as the early information policy (EIP). Our main objective is to study a realistic representation of the problem at hand, measure the damages caused by clearing customers who have left the system, and try to correlate these damages with the system characteristics.
The outline of the paper is as follows: Section 2 presents the analysis of the LIP model, including the exact model formulation and calculation of steady state probabilities and performance measures. In Section 3, the EIP model is derived. Section 4 provides a numerical comparison between the LIP and EIP models.

Mathematical Modelling
A single server is assigned to customers who follow a Poisson arrival process with rate λ. The customer queue is unobservable, and the server calls and serves customers in the order in which the tickets are issued upon arrival, i.e., in an FCFS regime. Upon arrival, a customer draws a number from a ticket machine, observes the displayed running number of the customer currently being served, and, based on the difference between these two numbers, decides to either join the queue or balk. The difference between the two numbers is called the queue length. Since a customer is informed of the current queue length only after her ticket is issued, a balking customer leaves a trace in the system, one that will be dispatched to the server and that we call a virtual customer. When a ticket number is called, the server either serves the corresponding customer if she did not balk (a real customer) or spends a certain amount of time waiting for a customer before acknowledging that the ticket number represents a customer who balked (a virtual customer). Both the service and calling times are assumed to follow exponential distributions, with service rate σ for real customers and calling rate µ for virtual customers (µ > σ).
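Although the rest of this section derives exact results, the dynamics just described can be cross-checked by a short discrete-event simulation. The sketch below is illustrative only (all helper names are ours; the balking probability ψ is passed in as a function, anticipating its formal definition in the next paragraph); it estimates the fraction of joining customers and the effective server utilization under the LIP:

```python
import random

def simulate_lip(lam, sigma, mu, psi, horizon=50_000.0, seed=7):
    """Discrete-event sketch of the LIP ticket queue.

    Every arrival prints a ticket; a balking customer leaves a
    'virtual' ticket that the server must still call and clear.
    """
    rng = random.Random(seed)
    t = 0.0
    tickets = []                     # waiting tickets: True = real customer
    in_service = None                # ticket currently being processed
    next_arrival = rng.expovariate(lam)
    next_completion = float("inf")
    busy_real = 0.0                  # time spent serving real customers
    joined = balked = 0
    while t < horizon:
        if next_arrival <= next_completion:          # next event: arrival
            t = next_arrival
            q = len(tickets) + (in_service is not None)  # ticket-queue length
            stays = rng.random() >= psi(q)           # joins w.p. 1 - psi(q)
            joined += stays
            balked += not stays
            tickets.append(stays)                    # the ticket exists either way
            next_arrival = t + rng.expovariate(lam)
        else:                                        # next event: completion
            t = next_completion
            in_service = None
            next_completion = float("inf")
        if in_service is None and tickets:           # server calls the next ticket
            in_service = tickets.pop(0)
            rate = sigma if in_service else mu       # serve real / clear virtual
            duration = rng.expovariate(rate)
            if in_service:
                busy_real += duration
            next_completion = t + duration
    return {"SL": joined / (joined + balked), "U_eff": busy_real / horizon}
```

Because every arrival prints a ticket regardless of the balking decision, the virtual tickets consume calling time at rate µ, which is exactly the server time loss quantified in this paper.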
Each arriving customer who sees q customers in the system acts as follows: (i) she enters the system if the number of customers in the system is at most a pre-specified value q_1; (ii) she balks if the number of customers is at least q_2 (q_1 < q_2); and (iii) in the in-between case, she enters the system with a probability that decreases with the queue length. We denote by ψ(q) the probability that an arriving customer balks when the queue length is q. Accordingly, ψ(q) = 0 for q ≤ q_1, and ψ(q) = 1 for q ≥ q_2.
At any moment in time, we denote by r the number of contiguous real customers at the head of the queue (including the customer in service) and by v the number of contiguous virtual customers immediately following those r real customers. In order to analyze the system in a steady state, we define the system's state by a (k + 2)-dimensional vector (r, v, c_1, c_2, . . . , c_k). The value of c_i, i = 1, 2, . . . , k − 1, indicates the number of virtual customers standing between the ith and (i + 1)th real customers (after the first r + v customers), and c_k denotes the number of virtual customers standing after the kth real customer; k is, therefore, the number of real customers standing after the aforementioned r + v customers. To illustrate the above definition, let us consider a few examples in which R and V stand for one real/virtual customer. In our notation, state (3, 2, 2, 0, 0, 2) stands for the queue composition RRRVVRVVRRRVV, state (0, 2, 1, 3, 1) stands for VVRVRVVVRV, and state (1, 0, 2, 0) stands for RRVVR. Consequently, the steady-state probabilities are defined as p_{r,v,c_1,c_2,...,c_k}.
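To make the state encoding concrete, the following sketch (function name ours) decodes a state vector into the corresponding queue composition and reproduces the three examples above:

```python
def decode_state(state):
    """Translate a state (r, v, c_1, ..., c_k) into an R/V string."""
    r, v, *c = state
    queue = "R" * r + "V" * v        # r real customers, then v virtual ones
    for c_i in c:                    # each later real customer is followed
        queue += "R" + "V" * c_i     # by c_i virtual customers
    return queue
```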
Each level is specified by the value of its state's last component, either v or c_k, which indicates the number of virtual customers standing at the end of the line. Thus, the number of levels is unbounded. Within each level, the system's states (each representing one of the system's phases), arranged in order, are described in Feature 1. First, the phases are partitioned into groups, each of which is characterized by the state's length (i.e., the number of components in the state), which is given by k + 2. Within each group, the states are arranged in lexicographic order of the states' component values. Then, each group corresponding to a given k is itself partitioned into sub-groups specified by the value of c_k, each of which is partitioned again into sub-groups specified by the value of c_{k−1}, and so on.
We now show in Proposition 1 that, for any given value of q_2, each level consists of 2^{q_2} phases.

Proposition 1. For any given value of q_2, each level consists of 2^{q_2} phases.
Proof. We call a state an accepting state if a real customer can enter the system in it, i.e., any state in which the number of customers is less than q_2. At each accepting state, increasing the number of real customers by one transfers the system to another phase. Thus, the number of phases equals the number of accepting states plus the first phase. Let AS(q) be the number of accepting states with q customers, q = 0, 1, 2, . . . , q_2 − 1. Since, at each state with q customers, each customer can be either real or virtual, it follows that AS(q) = 2^q. Thus, for any given value of q_2, the number of accepting states is ∑_{q=0}^{q_2−1} 2^q = 2^{q_2} − 1. Adding the first phase completes the proof.
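The counting argument in the proof is easy to verify by brute-force enumeration: for every queue length q < q_2, list all 2^q real/virtual compositions and add the first (empty) phase. A minimal sketch:

```python
from itertools import product

def num_phases(q2):
    """Count the accepting states (each of q customers real or virtual,
    for every q < q2) plus the first phase, by explicit enumeration."""
    accepting = sum(len(list(product("RV", repeat=q))) for q in range(q2))
    return accepting + 1   # the empty system is the additional first phase

# num_phases(q2) equals 2 ** q2 for every q2, matching Proposition 1
```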
Cases (i), (iii), (vi), (vii), and (viii) indicate transitions within a given level, activated by, respectively, the arrival of a real customer (case (i)), the departure of a real customer (cases (iii) and (vi)), and the clearance of a virtual customer (cases (vii) and (viii)). Cases (ii) and (x) indicate transitions from level i to level i + 1, i = 0, 1, 2, . . ., activated by the arrival of a virtual customer. Case (iv) indicates transitions from level i to level i − 1, i = 1, 2, 3, . . ., activated by the clearance of a virtual customer. Cases (v) and (ix) indicate transitions from level i to level 0, i = 1, 2, 3, . . ., activated by the arrival of a real customer. Let D_{i,0}, i = 0, 1, 2, . . . , q_2 − 1, denote the square matrices of order 2^{q_2} × 2^{q_2} composed of the transition rates from level i to level 0 activated by the arrival of a real customer (cases (i), (v), and (ix)), where, in each matrix D_{i,0}, the states are arranged in the order described in Feature 1. Let D denote the square matrix of order 2^{q_2} × 2^{q_2} composed of the transitions within a level, i.e., transitions caused by service completion (σ) and by clearance completion (µ), again with the states arranged in the order described in Feature 1.

Steady State Analysis
Let Q denote the infinitesimal generator matrix of the Markovian process described above. The matrix Q is composed of the matrices B_{i,j} and A_i, all of size 2^{q_2} × 2^{q_2}, together with the following notation: →e = (1, 1, · · · , 1)^T denotes a column vector of ones; →u_i denotes the column vector indicating the number of customers in each of the 2^{q_2} phases of level i, arranged in the order described in Feature 1, i.e., r + v + k + ∑_{j=1}^{k} c_j, for i = 0, 1, 2, . . . , q_2 − 1; X_{i,i+1} (of size 2^{q_2} × 2^{q_2}) = diag{→u_i}, where diag{· · ·} denotes the diagonal matrix with the diagonal entries listed in the brackets; and I denotes the identity matrix of the size indicated in its suffix.
Note that, following the order of phases described in Feature 1, the matrix A_1 is lower triangular. Let →p_i, i = 0, 1, 2, . . ., be the probability vectors of the system's states of level i (arranged in the order described in Feature 1), and denote the vector of all probabilities by →p = (→p_0, →p_1, →p_2, . . .). Then, the balance equations can be written as →p Q = →0 with the normalization →p · →e = 1. For further analysis, the elements of the matrices A_0, A_1, and A_2 are denoted by A_j ≡ [a_j^{v,t}], j = 0, 1, 2, where v = 1, 2, 3, . . . , 2^{q_2} and t = 1, 2, 3, . . . , 2^{q_2}.

Theorem 1. The system's stability condition is λ < µ.
Proof. According to Hanukov and Yechiali [58], when each of the matrices A_0, A_1, and A_2 is lower triangular (which is the case in our model), the stability condition is given by a_0^{1,1} < a_2^{1,1}, which, in our model, results in λ < µ.
Theorem 1 shows that the stability condition is not affected by the real customers' service rate, σ. This result is explained by the fact that the number of real customers in the system is bounded. Let R be the minimal non-negative matrix satisfying A_0 + R A_1 + R^2 A_2 = 0. In general, the matrix R is calculated via successive substitutions; see [59,60]. However, in some special cases, the matrix R can be obtained directly. One case is when A_2 is of rank 1, satisfying A_2 = →c · →r, where →c is a column vector and →r is a row vector normalized by →r · →e = 1 (see [21]); in this case, R can be calculated directly, with →r = (1, 0, 0, · · · , 0) in our model. Another, more general case (see [58]) is when each of the three matrices A_0, A_1, and A_2 is lower triangular, as is the case in the current model; in such a case, the entries of R ≡ [r_{v,t}] are given explicitly. The steady-state probability vectors then satisfy →p_{i+1} = →p_i R, i ≥ q_2 − 1. In order to calculate these probability vectors, one first needs to obtain the vectors →p_i, i = 0, 1, . . . , q_2 − 1, which is achieved by considering the corresponding vector equations from the set of balance equations. In the next two theorems, an alternative representation of the stability condition is given.
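To illustrate the successive-substitution scheme for R mentioned above, the sketch below iterates R ← −(A_0 + R²A_2)A_1^{−1} in the scalar case, where A_0 = λ, A_1 = −(λ + µ), and A_2 = µ correspond to a plain birth-death process and the minimal solution is known to be R = λ/µ; in the full model, the scalars become the 2^{q_2} × 2^{q_2} blocks:

```python
def rate_matrix_scalar(a0, a1, a2, tol=1e-12, max_iter=10_000):
    """Successive substitution for the minimal solution of
    A_0 + R*A_1 + R^2*A_2 = 0, written out in the scalar case."""
    r = 0.0
    for _ in range(max_iter):
        r_next = -(a0 + r * r * a2) / a1   # fixed-point update
        if abs(r_next - r) < tol:
            return r_next
        r = r_next
    return r

# Scalar birth-death sketch: A_0 = lam, A_1 = -(lam + mu), A_2 = mu
lam, mu = 1.0, 2.0
r = rate_matrix_scalar(lam, -(lam + mu), mu)
# r converges to lam/mu = 0.5
```

Note that the computed scalar R satisfies R < 1 exactly when λ < µ, mirroring the stability condition of Theorem 1.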
The next theorem shows that the representation of the stability condition introduced in Theorem 2 is also valid when the matrices A_0, A_1, and A_2 are upper triangular.

Theorem 3. If each of the matrices A_0, A_1, and A_2 is upper triangular, the stability condition is given by r_{n,n} < 1.
Proof. Since A_0, A_1, and A_2 are all upper triangular, a_1^{n,n} = −(a_0^{n,n} + a_2^{n,n}). Thus, according to the first term in Equation (2), we have r_{n,n} = a_0^{n,n}/a_2^{n,n}. According to Hanukov and Yechiali [58], when each of the matrices A_0, A_1, and A_2 is upper triangular, the stability condition is given by a_0^{n,n} < a_2^{n,n}, which proves the claim.
Note that r_{1,1} and r_{n,n} are, respectively, the upper-left and lower-right entries of R.

Performance Measures
The system performance is evaluated through several measures: customers are interested in their expected flow time (the sum of waiting time and service time) and their probability of receiving service (the complement of the balking probability). The server performance is evaluated by its utilization. Finally, we also measure the total (ticket) queue length.
As mentioned, the only difference between levels is the value of the state's last component (v or c_k). Thus, when this value is excluded, each phase has the same number of customers in the system at all levels. Let →u_Q be the column vector indicating the number of customers in each of the 2^{q_2} phases, arranged in the order described in Feature 1. Let L be the mean total number of customers in the system, and let N be the mean number of real customers in the system. In order to calculate N, let →u_N be the column vector indicating the number of real customers in each of the 2^{q_2} phases, arranged in the order described in Feature 1, i.e., r + k. The effective arrival rate of real customers, λ_eff, is the rate of customers who actually join the queue. Then, using Little's law, the expected flow time F of a customer who has decided to join the queue is obtained as F = N/λ_eff. Let SL be the service level, i.e., the portion of customers who join the queue, given by SL = λ_eff/λ. The server utilization U is the proportion of time the server is busy. Let U_eff be the server's effective utilization, defined as the proportion of time the server is busy processing real customers (excluding the calling time for virtual customers), which is obtained from the portion of utilized time assigned to real customers.

The Early Information Policy (EIP)
In the EIP, the queue length is provided to the customer upon arrival, and the decision to leave is made before a ticket is issued. In this case, the ticket queue accurately describes the customer queue, and there is no loss of server capacity to calling balking customers. Such a system is a birth-and-death process whose states represent the queue length of non-balking customers. The arrival rate to a given state is the general arrival rate weighted by the probability that the customer stays in the queue. The birth process is, therefore, state-dependent with rate λ_q = λ(1 − ψ(q)). The death process is state-independent with rate σ.
Let us denote by π_q the steady-state probability of being in state q, and let ρ = λ/σ. Solving the balance equations of state q is straightforward (see Kleinrock [61], p. 92) and leads to π_q = π_0 ρ^q ∏_{i=0}^{q−1} (1 − ψ(i)), q = 1, . . . , q_2, where π_0 follows from the normalization ∑_{q=0}^{q_2} π_q = 1. The expected service level SL can be expressed in terms of these probabilities. The expected value of the ticket queue length is L_ear = ∑_{q=0}^{q_2} q π_q. Based on Little's law with an average arrival rate of λ · SL, we can compute the expected flow time as F_ear = L_ear/(λ · SL). The server utilization is given by U_ear = 1 − π_0.
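Because the EIP model is a plain birth-death chain truncated at q_2, its steady-state probabilities and the measures above take only a few lines of code. The sketch below assumes a piecewise-linear balking function (our illustrative choice, linearly interpolating between q_1 and q_2); with λ = 25, σ = 20, q_1 = 1, and q_2 = 3 it reproduces the values L_ear ≈ 1.525 and U_ear ≈ 0.791 reported in the numerical section:

```python
def eip_measures(lam, sigma, psi, q2):
    """Steady-state probabilities and basic measures of the EIP
    birth-death queue with state-dependent birth rate
    lam_q = lam * (1 - psi(q)) and constant death rate sigma."""
    x = [1.0]                                  # x[q] proportional to pi_q
    for q in range(q2):                        # states above q2 are unreachable
        x.append(x[-1] * lam * (1.0 - psi(q)) / sigma)
    z = sum(x)
    pi = [v / z for v in x]                    # normalize
    L = sum(q * p for q, p in enumerate(pi))   # mean ticket-queue length
    U = 1.0 - pi[0]                            # server utilization
    return pi, L, U

def psi_linear(q, q1=1, q2=3):
    """Assumed piecewise-linear balking probability."""
    if q <= q1:
        return 0.0
    if q >= q2:
        return 1.0
    return (q - q1) / (q2 - q1)
```

The normalization step replaces solving the full balance equations, since the chain is skip-free in both directions.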

Numerical Results: LIP vs. EIP
In the following, all the calculations were conducted in Maple 2021. Our work focuses on the impact of providing delayed information on the system. We, therefore, compare each of the five aforementioned performance measures under the EIP vs. its counterpart under the LIP over a range of traffic intensity values. We assume that the balking function is piecewise linear, i.e., ψ(q) = 0 for q ≤ q_1, ψ(q) = (q − q_1)/(q_2 − q_1) for q_1 < q < q_2, and ψ(q) = 1 for q ≥ q_2. For the sake of illustrating the behavior of our model, we use the following parameter values as an example: λ = 25, σ = 20, µ = 30, q_1 = 1, q_2 = 3. Based on these values, the performance measures can be calculated by the following procedure: (i) we obtain the rate matrix R by using Equations (1)-(3); (ii) we calculate the steady state probabilities using Equation (4); (iii) we use Equations (5)-(9) to obtain the following performance measures for the late information policy: L = 6.717, F = 0.089, SL = 0.213, U = 0.922, U_eff = 0.267; and, finally, (iv) we use Equations (10)-(13) to obtain the following performance measures for the early information policy: L_ear = 1.525, F_ear = 0.076, SL_ear = 0.796, U_ear = 0.791. We illustrate the improvement achieved by adopting the early information policy by calculating the ratio between corresponding performance measures as presented above.
Equation (15) shows that the mean number of customers present in the system decreases when adopting the early information policy. Similarly, Equation (16) shows that the mean time a customer spends in the system decreases with the EIP. Equation (17) shows that the service level is higher with the EIP. Equation (18) shows that, with the EIP, the server idle time increases; however, Equation (19) shows that, with the EIP, the server spends more time actually serving customers.
We expand the comparison between the EIP and LIP policies with the following sensitivity analysis, in which all rates (µ, σ, λ) are measured in units (customers) per minute. The calling rate is kept constant at µ = 30, and the service rate σ was chosen from the set {3, 10, 20}, resulting in ratios µ/σ in {1.5, 3, 10}. The smaller the σ, the less significant the calling time is compared to the real customers' service time. The arrival rate λ evolved relative to both service rates. Two values of λ were chosen slightly under σ, namely, λ ∈ {0.75σ, 0.95σ}, to represent under-stressed scenarios, i.e., situations in which the queue would have been stable even without balking. The remaining scenarios, which we denote as stressed, are characterized by an arrival rate higher than σ; in these cases, the balking mechanism allows the system to stabilize. The arrival rate values were set as λ = σ + γ(µ − σ) with γ ∈ {0, 0.25, 0.5, 0.75, 0.9, 0.99}. The two highest λ values were chosen to express severe system congestion. The balking function's parameters are assumed to be q_1 = 1 and q_2 = 3. Figures 2-6 present, respectively, for each of the five performance measures, the ratio of the measure under the EIP vs. its counterpart under the LIP as a function of the traffic level and for each of the three service rates. Notice that, because the ratio values span a very large range for the service level and for the effective utilization, a logarithmic scale was used. Moreover, the abscissa presents the values of γ for the stressed scenarios (between the two vertical lines marking γ = 0 and γ = 1), except for the two smallest values, which stand for the under-stressed scenarios {0.75σ, 0.95σ}. Figures 2 and 4 clearly show the steep degradation in the performance of the LIP as the traffic intensifies and reaches saturation.
At this point, under the EIP, both the service level and the effective utilization are more than two orders of magnitude higher than their LIP counterparts, meaning that the LIP essentially paralyzes the system at higher levels of congestion. Although the log scale makes it difficult to gauge the effect of the difference between the service and calling rates, in the stressed scenarios, the effective utilization ratios for σ = 20 are between 36% and 47% higher than for σ = 3. This is not surprising: the longer the calling times relative to the service times, the less efficient the server is expected to be. All this happens while the server utilization levels stay comparable (Figure 3). In terms of the customer experience, although the queue length under the EIP is shorter than under the LIP, with the gap between them increasing with the level of congestion (Figure 5), Figure 6 shows that the flow time remains comparable between the two policies when the calling times are relatively small, with a clear advantage to the EIP when the calling rate gets closer to the service rate.
To conclude this short comparative section, let us state that, while the server may experience about the same level of utilization under both the EIP and LIP and the sojourn time of the customers that decide to stay in the queue may also remain comparable, the difference between the two information policies is mostly felt in the service level, with the LIP crippling the system as the traffic increases up to almost total paralysis for heavy congestion.

Conclusions
In this work, we addressed the damage done to the performance of ticket queues by virtual customers. We first demonstrated how ignoring their creation proves to be a poor management policy: the lateness of information has a significant impact on most aspects of the system performance from the customer point of view (service level, flow time), while being only slightly detrimental to the operation from the server point of view (utilization). We then studied the sensitivity of the results to the system characteristics. The importance of the information policy for the service level was shown to be heavily dependent on the traffic intensity (the arrival rate relative to both the calling and service rates). When the calling times and service times are close, we showed that the late information policy is even more damaging to the system and that the loss in terms of the flow time and effective utilization is even greater compared to the early information policy.
The practical solution to the costly phenomenon presented here may seem simple to implement: provide the customer with the highest possible quality of information prior to her entry into the queue. In our simplified model, this means informing the customer of the queue size before she prints a ticket. However, customers tend to leave not only upon arrival but also when the waiting time is too long or the apparent queue progression is too slow (reneging). The damage in terms of calling time is exactly the same, which means that every effort to reduce the creation of virtual customers should be considered: improving the quality of information by giving an estimate of the waiting time, allowing customers to run errands and prompting them with messages as their time of service approaches, allowing them to make "appointments", or making their stay more comfortable.
Another type of solution could involve eliminating virtual customers from the ticket queue, for example, by prompting them to rescan their ticket close to their time of service, or, if a time comes when queues are managed through customer mobile devices, by identifying customers that leave the queue.
While some of these solutions may be costless on new systems (it only takes simple programming to display the estimate of the expected waiting time), implementing them on existing systems may require significant expenses. This is even more relevant when considering possible improvements to the waiting room. The proposed analysis allows one to estimate the worthiness of any such improvement.
The proposed methodology could be broadened in several directions. First and foremost, future research can tackle the problem of reneging in the same type of partially observable queues with calling times, using similar tools. Second, by using simulation to overcome analytical difficulties, we could tackle more complex, possibly intractable variations of the present system, including, for example, both balking and reneging, blocking, or multiple servers. Finally, it would only be natural to bring the ideas presented here to the field, conduct naturalistic studies on real-life instances of this problem, and consider the effectiveness of different solutions.