Coupon Collector Problem with Reset Button

Abstract: We consider the following generalization of the classical coupon collector problem. We assume that, in addition to the initial collection of standard coupons, there is one more coupon that acts as a reset button, removing all coupons from the part of the collection that has already been drawn. For the case where standard coupons have unequal probabilities of being drawn, we obtain the distribution of the waiting time until the end of the collection process. For the case where standard coupons have equal probabilities, we derive a simple formula for the expected waiting time in terms of the beta function, and discuss the asymptotic properties of this expected waiting time as the number of standard coupons tends to infinity.


Introduction
In the classical coupon collector problem (CCP) the goal is to analyze the waiting time (in days) until a collector, who buys one coupon each day, completes a full collection of $n$ distinct coupons from the set $N_n = \{1, 2, \ldots, n\}$, assuming that each coupon is drawn with probability $\frac{1}{n}$. Similarly, the coupon collector problem with unequal probabilities (CCPU) refers to the situation where the coupon $j \in N_n$ is drawn with probability $p_j$ and $\sum_{j=1}^{n} p_j = 1$.
The coupon collector problem can be formulated in different ways. In its simplest form, it appears in the well-known work [1], but it can also be formulated and treated as a special kind of urn model (see, for example, [2,3]) or in the context of formal languages [4]. On the other hand, several results related to waiting time problems have been obtained using Markov chain techniques (see, for example, [5,6]). The coupon collector problem and its generalizations have also led to various types of asymptotic results (see, for example, [7–9]).
We consider the following generalization of the coupon collector problem, which, to our knowledge, has not been considered before. We assume that there is a special coupon (we call it the reset coupon) that does not belong to $N_n$ and acts as a reset button, in the sense that the set of coupons drawn up to time (day) $t$ becomes empty when the reset coupon is drawn on day $t + 1$. After that, the collection can start again from the beginning (or not). Therefore, we work with an augmented set of available coupons, $N_n^{(\otimes)} = \{1, 2, \ldots, n, \otimes\}$, where $\otimes$ denotes the reset coupon.
We call this version of the problem coupon collector with reset button and refer to it as CCPRB in the rest of the text.
We assume, in sampling with replacement, that the probability of obtaining the reset coupon is $p_R$, $0 \le p_R < 1$, and that the probabilities of obtaining standard coupons (from the set $N_n$) sum up to $1 - p_R$. Obviously, the problem reduces to CCP or CCPU when $p_R = 0$.
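The sampling scheme just described can be illustrated by a short Monte Carlo sketch. It assumes the equal-probability case (each standard coupon drawn with probability $(1 - p_R)/n$) and a collector who keeps collecting after every reset; the function name `simulate_ccprb` and the chosen parameter values are ours and serve only as an illustration.

```python
import random

def simulate_ccprb(n, p_reset, rng):
    """Number of days until all n standard coupons are held at the same time.

    Assumption: equal probabilities for standard coupons, and the collector
    continues after each reset.
    """
    collected = set()
    days = 0
    while len(collected) < n:
        days += 1
        if rng.random() < p_reset:
            collected.clear()                 # reset coupon: collection is emptied
        else:
            collected.add(rng.randrange(n))   # uniformly chosen standard coupon
    return days

# Illustrative run: n = 2 standard coupons, reset probability 1/7.
rng = random.Random(0)
samples = [simulate_ccprb(2, 1 / 7, rng) for _ in range(20000)]
print("estimated expected waiting time:", sum(samples) / len(samples))
```

Setting `p_reset` to 0 gives a simulation of the classical CCP, in line with the reduction noted above.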
The CCPRB problem we consider belongs to the group of generalizations of CCP that are based on the idea of introducing additional coupons with special purposes into the coupon set. Other generalizations of CCP of this type are analyzed in [5,10–13].
The variant of the coupon collector problem where each coupon can have several purposes (so-called goals) is considered in [13]. In this case, the experiment ends when the sum of the numbers of goals reaches a certain limit.
Another generalization of the classical coupon collector problem is proposed in [11], where the appearance of an additional coupon (the so-called bonus coupon) leads to obtaining one more coupon.
In [12], the author considers the case where the additional coupon (the so-called penalty coupon) interferes with collecting standard coupons, in the sense that the collection process ends when the absolute difference between the number of collected standard coupons and the number of collected penalty coupons is equal to the total number of standard coupons.
In this work, we refer to results related to the coupon collector problem with a null coupon, as considered in [5,10]. This is the situation where the probabilities of the standard coupons sum up to $\sum_{j=1}^{n} p_j = 1 - p_N < 1$, or, equivalently, there is a null coupon that can be drawn with probability $p_N$ but does not belong to any collection. This variant of the problem reduces to CCPU when $p_N = 0$.
It is well known that the coupon collector problem has various applications in engineering (see, for example, [14]). In particular, the CCPU has recently been used in biology, to model parasitism [15], and in telecommunications, to model the transmission of information in computer networks [16] and to analyze Internet security problems [5,10,17].
The structure of this paper is as follows. In Section 2, we obtain the general properties of the waiting time for a full collection of standard coupons, in the case of CCPRB with unequal probabilities. In Section 3, we obtain the expected waiting time for a full collection in the case of equal probabilities, and derive its relation to the beta function. In Section 4, we provide some numerical examples. In Section 5, we discuss the asymptotic properties of the expected waiting time for a full collection in the case of equal probabilities, for different values of $p_R$ as $n$ tends to infinity, and give some specific examples. The conclusions are given in Section 6.

Distribution of the Waiting Time for a Full Collection in General Case
Here, we derive the distribution of the waiting time until a full collection of standard coupons is sampled in the case of unequal probabilities (where each coupon $j \in N_n$ is drawn with probability $p_j$, such that $\sum_{j=1}^{n} p_j = 1 - p_R$). Let $W_n^{(\otimes)}$ denote the waiting time until a full collection of standard coupons in CCPRB is collected, and let $W_n$ denote the corresponding waiting time in the coupon collector problem with unequal probabilities and no additional coupons. We will also use the notation $\mathbb{N}_0 = \{0, 1, 2, \ldots\}$ and $P_J = \sum_{j \in J} p_j$.
The distribution of the waiting time $W_n$ is a well-known result (see, for example, [5], Theorem 1, p. 409).
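For orientation, the tail of such a waiting time can be computed with the standard inclusion-exclusion argument: some coupon is still missing after $k$ draws exactly when at least one of the events "coupon $j$ has not appeared" occurs, and no coupon from a set $J$ appears in $k$ draws with probability $(1 - P_J)^k$. The sketch below implements this textbook form, which may look different from the expression in [5]; it assumes independent draws in which coupon $j$ appears with probability `probs[j]`, with any remaining probability mass going to draws that collect nothing. The function name `survival_ccpu` is ours.

```python
from itertools import combinations

def survival_ccpu(probs, k):
    """P{W_n > k}: at least one coupon is still missing after k draws."""
    n = len(probs)
    total = 0.0
    for size in range(1, n + 1):
        sign = 1.0 if size % 2 == 1 else -1.0
        for subset in combinations(probs, size):
            # (1 - P_J)^k: no coupon from the subset J appears in k draws
            total += sign * (1.0 - sum(subset)) ** k
    return total

# Two equally likely coupons: after two draws the collection is
# incomplete exactly when both draws show the same coupon.
print(survival_ccpu([0.5, 0.5], 2))
```

The loop visits all $2^n - 1$ non-empty subsets, so this direct evaluation is only practical for small $n$.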
The corresponding result for the waiting time $W_n^{(\otimes)}$ is obtained in the next theorem. In the rest of the text, we will use the abbreviation $D_{i,k}$ for

Theorem 1. For every $k \ge 0$, for the waiting time $W_n^{(\otimes)}$, the following relations hold:

1. P{W

2. P{W

Proof.

1.
Each sequence of draws up to time $k$ can be presented in the form

$$B_1 R B_2 R \cdots B_i R B_{i+1},$$

where $R$ represents a single reset coupon and $B_j$, $j \in \{1, 2, \ldots, i + 1\}$, represent blocks of standard coupons, such that the length of the block $B_j$ is $k_j \ge 0$, $\sum_{j=1}^{i+1} k_j = k - i$, and none of the blocks $B_j$ consists of a full collection of standard coupons. We define the events as follows:

Therefore,

On the other hand, we have

since all the blocks $B_j$ consist of incomplete collections of standard coupons, their lengths sum up to $k - i$, and the appearances of blocks with any combination of lengths are mutually independent events. This completes the proof of the statement.

2.
This is a simple modification of the first part of the theorem. Each realization of the experiment has the form

$$B_1 R B_2 R \cdots B_i R B_{i+1},$$

where none of the blocks $B_j$, $1 \le j \le i$, consists of a full collection of standard coupons, and the block $B_{i+1}$ consists of the full collection of standard coupons. Using the fact that

we complete the proof of the theorem.
Remark 1. The sequence of draws in the CCPRB can be seen as a renewal process, as the coupon collection starts over after each reset. More precisely, the events of the type $B_j R$, defined in (5), can be regarded as recurring events, in the sense of Definition 1, p. 308 in [1].
Example 1. If $n = 2$, $p_R = \frac{1}{7}$ and $p_1 = p_2 = \frac{3}{7}$, the probability that the full collection of standard coupons has not been drawn by the time (day

where

Therefore,

Remark 2. The expected waiting time for a full collection can be obtained as

where P{W

Remark 3. It is well known that, for large $n$, the computation of probabilities associated with the coupon collector problem with unequal probabilities, such as (3), becomes computationally intensive, and requires some sort of approximation or bounds. This is even more evident in the case of CCPRB. However, upper and lower bounds for the probability (3) can be obtained directly by applying the corresponding upper and lower bounds for the probability (2) in (3). For a detailed discussion on this topic, and a comprehensive list of upper and lower bounds on the probability (2), see [17]. Observing that

we obtain an additional, simple lower bound for the probability P{W

Remark 4. Let $W_{n,c}^{(\otimes)}$ denote the waiting time until $c$, $1 \le c \le n$, out of $n$ coupons in CCPRB are collected. For this waiting time, results analogous to Theorem 1 and its consequences can be derived using the same technique.
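For small $n$, the tail probabilities of the waiting time in the CCPRB can also be obtained by brute force, which gives an independent check on closed-form expressions. The sketch below does not use the block decomposition from the proof of Theorem 1; instead, it propagates the distribution of the currently held subset of coupons day by day (a technique we substitute here for illustration). The function name `survival_ccprb` and the bitmask encoding are ours, and the parameter values are those of Example 1.

```python
from fractions import Fraction

def survival_ccprb(probs, p_reset, k_max):
    """Return [P{W > k} for k = 0, ..., k_max] in the CCPRB (needs len(probs) >= 1).

    probs[j] is the probability of standard coupon j; p_reset is the
    probability of the reset coupon. Held subsets are encoded as bitmasks.
    """
    n = len(probs)
    full = (1 << n) - 1
    dist = {0: Fraction(1)}        # mass on incomplete subsets only
    tail = [Fraction(1)]           # P{W > 0} = 1
    for _ in range(k_max):
        nxt = {}
        for state, mass in dist.items():
            nxt[0] = nxt.get(0, 0) + mass * p_reset      # reset empties the set
            for j, pj in enumerate(probs):
                new = state | (1 << j)
                if new != full:                          # completed paths leave the tail
                    nxt[new] = nxt.get(new, 0) + mass * pj
        dist = nxt
        tail.append(sum(dist.values()))
    return tail

p = [Fraction(3, 7), Fraction(3, 7)]
tail = survival_ccprb(p, Fraction(1, 7), 200)
print(tail[2])                 # probability that two days are not enough
print(float(sum(tail)))        # truncated tail sum, approximating the expected waiting time
```

The last line relies on the standard identity that the expectation of a non-negative integer-valued waiting time equals the sum of its tail probabilities $P\{W > k\}$ over $k \ge 0$; the sum is truncated at `k_max`, which is harmless here because the tail decays geometrically.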

Expected Waiting Time for a Full Collection in the Case of Equal Probabilities
The expected waiting time for a full collection, or a subcollection, can be obtained from (10). However, if we assume that all the standard coupons have an equal probability $p = \frac{1 - p_R}{n}$ of being drawn, the expected waiting time for a full collection has a simpler form, which is conveniently derived using the Markov chain technique.
Let $X_t$ be the number of different types of standard coupons sampled after $t$ units of time (days). We can notice that $\{X_t, t \in \mathbb{N}\}$ is a Markov chain on the state space $S = \{0, 1, \ldots, n\}$. Depending on how we define the end of the collection process, we can distinguish between two characteristic cases.

Case 1: The Collector Gives up Collecting after the First Reset
In the first case, the collector starts with a certain number of coupons, and buys coupons until he completes his collection, or the first reset happens.
The transition probability matrix is and is the waiting time until absorption, starting from the state k.

Case 2: The Collector Keeps Collecting after the Reset
In this case, the collector buys coupons until he completes his collection, regardless of how many resets occur in the meantime.
The transition probability matrix is and is the waiting time until absorption, starting from the state k.
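The two cases can be explored numerically with a first-step analysis. The sketch below rests on transition probabilities read off from the verbal description, not transcribed from the matrices (14) and (16): from state $k < n$ the chain stays in $k$ with probability $kp$, moves to $k + 1$ with probability $(n - k)p$, and with probability $p_R$ is either absorbed (Case 1) or returns to state 0 (Case 2). Writing $u_k = \alpha_k + \beta_k u_0$ turns the Case 2 equations into a backward recursion, and $\alpha_k$ alone solves the Case 1 equations. The function name `expected_times` is ours.

```python
from fractions import Fraction

def expected_times(n, p_reset):
    """Expected times to absorption from each state k = 0, ..., n.

    Returns (u, v): u[k] for Case 2 (collector continues after resets),
    v[k] for Case 1 (stops at the first reset; meaningful for k >= 1).
    Pass p_reset as a Fraction to obtain exact results.
    """
    p = (1 - p_reset) / n
    alpha = [Fraction(0)] * (n + 1)    # alpha[n] = 0: collection already complete
    beta = [Fraction(0)] * (n + 1)
    for k in range(n - 1, -1, -1):
        stay = 1 - k * p
        # u_k = 1 + p_R u_0 + k p u_k + (n - k) p u_{k+1}, with u_k = alpha_k + beta_k u_0
        alpha[k] = (1 + (n - k) * p * alpha[k + 1]) / stay
        beta[k] = (p_reset + (n - k) * p * beta[k + 1]) / stay
    u0 = alpha[0] / (1 - beta[0])
    u = [alpha[k] + beta[k] * u0 for k in range(n + 1)]
    v = alpha                          # Case 1: a reset ends the process
    return u, v

u, v = expected_times(2, Fraction(1, 7))
print(u[0], v[1])
```

With `p_reset` equal to zero the recursion reduces to the classical coupon collector, so `u[0]` then equals $n$ times the $n$-th harmonic number.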
The expected waiting times $v_k$ and $u_k$, $k \in S$, are obtained in the next theorem.

1.
For the expected waiting time $v_k$ in Case 1 (Section 3.1), the following relations hold:

where $\Gamma(\cdot)$ denotes the gamma function.

1.
Applying first-step analysis to the Markov chain with the transition probability matrix (14), we obtain that

The recurrence (20) is solved in [18] (Problem 6.2, p. 165):

where

Next, we have

Now, we can rewrite (21) as

which completes the proof of the statement.

2.
Applying first-step analysis to the Markov chain with the transition probability matrix (16), we conclude that

and

Applying the substitution

we obtain Equation (20), and the solution is given by (21). From (26), it follows that

and

which completes the proof of the theorem.
Example 2. The case $p_R = 0$ is the CCP, and the waiting time $u_0$ becomes the expected waiting time in CCP. From (24) with $k = 1$ we have n−1 (0), Using $p_R = 0$ in (29) and (27) leads to the well-known result, related to harmonic numbers: n (0).
The expected waiting times $v_1$ and $u_0$ are the most general, in the sense that the collector has to wait for an almost full or a full collection. Next, we will provide simplified expressions for the waiting times $v_1$ and $u_0$ for the case $p_R > 0$. For that purpose, we need the next lemma.
For any $m \in \mathbb{N}$, the following equality holds:

Proof.

1.
We use mathematical induction on $m$. For $m = 1$, we easily confirm that the equality (33) is valid. Next, assuming that (33) holds for $m - 1$, we check that (33) holds for $m$. We have

which completes the proof of the statement.

Numerical Examples
In this section, we provide numerical examples for the CCPRB with equal probabilities, as analyzed in Section 3. We assume that the set of available coupons is $N_{10}$. We consider different values of the probability $p_R$ and calculate the expected waiting time $u_0 = u_0^{(10)}$ for this case using formula (27). The results are shown in Table 1. The statistical software R, version 2023.03.0+386, was used for all calculations. Next, we show how the expected waiting time $u_0$ depends on the probability $p_R$ for different values of $n$.
Note that the behavior of the expected waiting time $u_0$, depicted in Figure 1, is consistent with the intuition we have about CCPRB: $u_0$ increases as $n$ increases (as having more standard coupons to collect extends the waiting time), and $u_0$ increases as $p_R$ increases (as resets remove the coupons already collected, and therefore extend the waiting time). In some of the cases considered, we can also notice a kind of exponential growth, which we discuss in more detail in the next section.
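Calculations of this kind can be reproduced in a few lines. The sketch below is written in Python rather than R and assumes the closed form $u_0 = \frac{1}{n p_R^2 B(n, a)} - \frac{1}{p_R}$ with $a = \frac{n p_R}{1 - p_R}$, which follows from a renewal argument applied to the chain of Case 2; it is our own derivation and should not be read as a transcription of formula (27). The grid of $p_R$ values is purely illustrative and is not taken from Table 1, and the function name `u0_beta` is ours.

```python
import math

def u0_beta(n, p_reset):
    """Expected waiting time u_0 for n equally likely standard coupons.

    Assumed closed form (for p_reset > 0):
        u_0 = 1 / (n * p_R^2 * B(n, a)) - 1 / p_R,  a = n * p_R / (1 - p_R).
    """
    if p_reset == 0:
        # classical coupon collector: n times the n-th harmonic number
        return n * sum(1.0 / j for j in range(1, n + 1))
    a = n * p_reset / (1.0 - p_reset)
    # log B(n, a) via log-gamma, which avoids overflow for large n
    log_beta = math.lgamma(n) + math.lgamma(a) - math.lgamma(n + a)
    return math.exp(-math.log(n) - 2.0 * math.log(p_reset) - log_beta) - 1.0 / p_reset

for p_reset in (0.0, 0.01, 0.05, 0.1, 0.2, 0.5):   # illustrative grid
    print(f"p_R = {p_reset:4.2f}   u_0 = {u0_beta(10, p_reset):.4f}")
```

For very small positive $p_R$ the two terms of the closed form nearly cancel, so a direct recursion on the first-step equations is numerically safer in that regime.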

Asymptotic Properties of the Waiting Times v 1 and u 0
Here, we analyze the properties of the expected waiting times until the end of the collection process, as the number of standard coupons $n$ tends to infinity, for different values of the probability $p_R$. We can distinguish between the case when $p_R$ is fixed, and the case when $p_R$ depends on $n$.
For fixed $p_R \in (0, 1)$, we can apply the Stirling approximation to the term $B\left(n, \frac{n p_R}{1 - p_R}\right)$ in Theorem 3, and obtain the asymptotic estimate for $u_0$, as $n \to \infty$, formulated in the next proposition.
Proposition 1. For any $p_R \in (0, 1)$, the following asymptotic relation holds as $n \to \infty$:

Remark 5. The expression (43) can be alternatively written in the following form:

The relations (43) and (44) also hold when $p_R$ depends on $n$, in the case when the ratio $\frac{n p_R}{1 - p_R} = \frac{p_R}{p}$ tends to infinity as $n \to \infty$.
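The effect of the Stirling approximation can be illustrated numerically. The sketch below assumes the closed form $u_0 = \frac{1}{n p_R^2 B(n, a)} - \frac{1}{p_R}$, $a = \frac{n p_R}{1 - p_R}$, and the leading-order estimate $u_0 \approx \rho^{-n} / \left(p_R^{3/2} \sqrt{2 \pi n}\right)$ with $\rho = (1 - p_R)\, p_R^{p_R / (1 - p_R)}$, obtained by inserting Stirling's formula for the gamma function into the beta term. Both expressions are our own working and may differ in form from (43) and (44); the function names are ours.

```python
import math

def u0_exact(n, p_reset):
    """Assumed closed form for u_0 (p_reset > 0), evaluated via log-gamma."""
    a = n * p_reset / (1.0 - p_reset)
    log_beta = math.lgamma(n) + math.lgamma(a) - math.lgamma(n + a)
    return math.exp(-math.log(n) - 2.0 * math.log(p_reset) - log_beta) - 1.0 / p_reset

def u0_stirling(n, p_reset):
    """Leading-order estimate from Stirling's formula (our derivation)."""
    rho = (1.0 - p_reset) * p_reset ** (p_reset / (1.0 - p_reset))
    return rho ** (-n) / (p_reset ** 1.5 * math.sqrt(2.0 * math.pi * n))

for n in (10, 50, 200):
    ratio = u0_exact(n, 0.3) / u0_stirling(n, 0.3)
    print(f"n = {n:3d}   exact / estimate = {ratio:.5f}")
```

Since $\rho < 1$ for every $p_R \in (0, 1)$, the estimate grows exponentially in $n$, which is consistent with the growth visible in the numerical examples.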
For the case when the ratio $\frac{n p_R}{1 - p_R}$ is bounded (which means that the collection process is not interrupted too often by resets), we have the following asymptotic result.

Proposition 2. For
This case is "exactly solvable", in the sense that, for a given value $u_0$, we can simply obtain $n = n_{u_0}$ such that the expected waiting time is less than or equal to $u_0$. Precisely, from the inequality $n^2 + n \le u_0$ and the fact that $n_{u_0} \ge 0$, we obtain that

$$n_{u_0} \le \frac{-1 + \sqrt{1 + 4 u_0}}{2}.$$

Example 4. The case $p_R = \frac{1}{2}$ corresponds to the situation where the ratio $\frac{n p_R}{1 - p_R}$ is equal to $n$. Using Theorem 3, we obtain the expressions

and

Lemma 1. Let $(a)_k$ denote the falling factorial:

Figure 1. Expected waiting time $u_0$ in terms of $p_R$ for different values of $n$.

Table 1. Expected waiting time u