Abstract
We study the asymptotic duration of optimal stopping problems involving a sequence of independent random variables drawn from a known continuous distribution. These variables are observed sequentially, no recall of previous observations is permitted, and the objective is to form an optimal strategy to maximise the expected reward. In our previous work, we presented a methodology, borrowing techniques from applied mathematics, for obtaining asymptotic expressions for the expected duration of the optimal stopping time when one stop is permitted. In this study, we generalise to the case where more than one stop is permitted, with an updated objective function of maximising the expected sum of the variables chosen. We formulate a complete generalisation for an exponential family as well as the uniform distribution by utilising an inductive approach in the formulation of the stopping rule. Explicit examples are shown for common probability distributions, alongside simulations that verify the asymptotic calculations.
Keywords:
sequential decision analysis; optimal stopping; multiple optimal stopping; secretary problems; asymptotic approximations
MSC:
62L15; 60G40; 41A60
1. Introduction
Optimal stopping problems are formulated in terms of observing random variables and determining the stopping time(s) in order to maximise an objective function (which can be thought of as some reward). The discrete single-stopping problem involves observing a sequence of random variables and choosing to stop on a particular observation, a decision based only on the variables that have been observed so far. After stopping, the “player” receives some pay-off which is a function of the variables observed. In [1], we used the value of the variable stopped on as the pay-off. To extend this to the multiple optimal stopping problem, we stop on several variables and, after all stops are made, receive a gain which is a function of the variables observed. This problem is a subset of a general class of optimal stopping problems that all aim to find a sequential procedure to maximise the expected reward (see Section 13.4 of [2] for a more extensive discussion of this class of problem). The secretary problem is arguably the most well known (see [3,4]), and it has a wide range of variations (see [5,6]), but there is also a rich literature of other examples (see [7]). By the ‘duration’ (sometimes referred to as ‘time’) of the stopping problem, we refer to how many observations the statistician makes before optimally stopping. Knowing how long one would be waiting, on average, is also useful to understand for potentially large sequences of observations (see [8]), which highlights the need for asymptotic analysis to address this question.
Less focus has been placed on understanding the asymptotic behaviour of the stopping duration, with most pre-existing results focusing on secretary-type, so-called “no-information”, problems where the distribution of the observations is unknown. The asymptotic expectation and variance of the stopping time for the secretary problem were studied in [9]. Similar asymptotic analyses for other variants of no-information problems can be found in [10,11,12,13], where the techniques/formulations used differ depending on the particular variation or structure of the problem.
There is substantially less literature describing the asymptotic behaviour for “full-information” problems, when the distribution of the variables is known a priori. A smaller subset addresses the multiple stopping problem, which we focus on in this study.
Gilbert and Mosteller [11] studied the optimal stopping strategy for the full-information problem in which the objective is to maximise the probability of attaining the best observation, known as the full-information best-choice problem [14]. The optimal rule was shown to be a threshold strategy wherein the player would stop on an observation if it was the best observed so far and its value exceeded a threshold that depended on m. The asymptotic behaviour of this rule was also derived.
In the full-information case, the pay-off can instead be in terms of the actual values of the variables stopped on. A special case of this is the uniform game (see Section 5a of [11]), which is closely related to Cayley’s problem (see [3,15]). In [11], the authors showed an asymptotic expression for the expected reward of a sequence of n independent and identically distributed (iid) random variables having the standard uniform distribution (see also [15]). In [9], Mazalov and Peshkov found the asymptotic behaviour of the expected value and variance of the stopping time when the variables are from the uniform distribution. However, the techniques applied were specific to the structure of the distribution, and thus, it is difficult to extend or generalise this to other distribution functions. In [16,17], using extreme value theory, Kennedy and Kertz proved limit theorems for threshold-stopped random variables and derived the asymptotic distribution of the reward sequence of the optimal stopping of iid random variables. The asymptotic pay-off for the multiple case is briefly analysed in Section 5c of [11].
In [1], we outlined a novel approach for a general asymptotic technique for calculating the asymptotic behaviour of the pay-off, as well as the expectation and variance of the stopping time in the single-stopping case, for general classes of probability distributions in the full-information problem, where we wish to maximise the expected reward. The techniques in our previous paper, which are extended in this study, employ the asymptotic analysis of difference and differential equations in order to establish and solve asymptotic differential equations for the quantities of interest. Differential equations were also used in [18,19]. In this study, we extend some of our results in [1] to the multiple stopping case through inductive arguments and verify these results with simulations. For simplicity, we only analyse continuous distributions and reserve standard notation for the continuous probability density, cumulative distribution, and “survivor” functions, respectively. We use the notation ∼ to denote an asymptotic relation, meaning that the ratio of the two sides tends to one.
2. Formulation of the Multiple Optimal Stopping Problem
As in the single-stopping problem, we sequentially observe the sequence of independent, identically distributed (iid) random variables from a known distribution, and we must decide which k of these variables to stop on. After k stoppings, we receive the gain. The random variable can be interpreted as the value of some asset, such as a house, at time m. The problem of selling k identical objects in a finite time (or horizon) N, with one offer per time period and no recall of previous offers, is analogous to the multiple optimal stopping problem described. If we stop after a number of observations, then we proceed to observe the remaining sequence (whose length depends on when we stopped) and must solve a new optimal stopping problem on this sequence. From [20], we have the following theorem.
Theorem 1.
Let be a sequence of independent random variables with known cumulative distribution functions (cdfs). Let be the value, that is, the optimal expected reward, of a game with l stoppings and L steps. If there exist , then the value v of the ‘game’ is , where
We put
then is the optimal stopping rule.
If we define , then we may notice that the stopping rules now take the form
so we may interpret as the appropriate threshold value that needs to be satisfied for stopping on the ith occasion at the jth term in the sequence of N observations. We can then define , which can be interpreted as the probability that, in the above situation, we do not stop.
Example 1 (Multiple Stopping on the Uniform (0, 1) Distribution with stops).
Let be a sequence of independent, identically distributed random variables that follow the uniform distribution.
We derived the equations for in the original paper (for details, see [1]):
where . For we have that
which can then be used to numerically determine the values of the ‘game’.
For the example of , we may produce the values for and for each value of n from to , as displayed in Table 1. For example, consider this sequence of simulated variables:
For the first stop, we would arrive on , since is the first variable to satisfy and for the second, we arrive on , as this is the first subsequent variable for which . This particular example would have resulted in the reward
Table 1.
Values for the standard uniform distribution.
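The values in Table 1 can be reproduced numerically. The following Python sketch (function name is ours) assumes the standard dynamic-programming formulation of the sum-payoff problem, in which V(k, n) denotes the optimal expected sum with n observations remaining and k stops still permitted, stopping is forced once n = k, and otherwise V(k, n) = E[max(X + V(k−1, n−1), V(k, n−1))]; for Uniform(0, 1) observations this expectation integrates in closed form.

```python
def uniform_values(N, K):
    """V[k][n]: optimal expected sum of the k chosen values for Uniform(0, 1)
    observations, with n observations remaining and k stops still permitted."""
    V = [[0.0] * (N + 1) for _ in range(K + 1)]
    for k in range(1, K + 1):
        for n in range(k, N + 1):
            if n == k:
                # Forced to stop on every remaining observation: k * E[X] = k / 2
                V[k][n] = k * 0.5
            else:
                a, b = V[k - 1][n - 1], V[k][n - 1]  # stop-now / continue values
                t = b - a                            # stopping threshold for x
                if t <= 0.0:
                    V[k][n] = 0.5 + a                # always stop
                elif t >= 1.0:
                    V[k][n] = b                      # never stop
                else:
                    # E[max(x + a, b)] for x ~ Uniform(0, 1)
                    V[k][n] = b * t + a * (1.0 - t) + (1.0 - t * t) / 2.0
    return V

V = uniform_values(10, 2)
print(V[1][2])  # 0.625: the classical single-stop value with two observations
```

The thresholds themselves are recovered as V[k][n] − V[k−1][n], which is the quantity the stopping rule in Example 1 compares each observation against.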
3. Computing Behaviour
We establish a similar recurrence result as we did for the single-stopping case. We note that this relation for will now be a second-order relation. For a convenient evaluation of the expectation, we use the fact that for any continuous integrable random variable X with cdf , the expectation can be given by
Theorem 2.
Let Y be an integrable random variable whose expectation exists, drawn from a continuous distribution with probability density function (pdf) and survivor function . The value of a sequence with steps and k stops remaining is given by
Proof.
For ease of notation, we let . Then, by definition, we have
where the last expectation, using (3), is given by
Substituting this result, noting that the terms cancel, we obtain the required result. □
We note that if has bounded support in the positive direction such that for , it follows from above that
By the controlling factor method, we also have that , which can be combined with the previous integral to establish the following:
This may be differentiated on both sides to obtain the asymptotic relation for the second derivative:
This expression can now be rearranged for to give
Depending on the asymptotic nature of and their derivatives, this result can be used through direct substitution. In other scenarios, the derivative expressions may not yield useful expressions and the behaviour of can be analysed directly without this result. We will provide variations of this in the subsequent example calculations.
Example Calculations
We illustrate the application of these ideas to some common distributions, such as the uniform and exponential distributions. However, the differential equations that arise from the multiple stopping problem are much harder to solve; some have no closed-form solution.
Example 2.
The uniform distribution is given by on , where .
For the differential equation for the large asymptotic behaviour, we have that . Defining and rearranging in the asymptotic relation for we obtain
From [1], we have the asymptotic relation , which can be directly substituted into the above equation:
We can solve this formal differential equation to obtain
where c arises as a constant of integration but may be dropped since it is part of a sub-dominant term. We now replace v with , substitute our asymptotic relation for , and rearrange for to obtain
We may notice, in general, that whenever v satisfies the relation
where is some positive constant, that we obtain the asymptotic relation
This can be used to generalise the asymptotic behaviour for for the uniform distribution.
Theorem 3.
Consider to be independent identically distributed uniform variables on , . The reward sequence follows the asymptotic relation
where for and .
Proof.
We have shown this to be true for and . Now, assume that
Now, defining , we have
which, from (10) and (11), has the asymptotic solution
To conclude the proof, we rearrange for to obtain
□
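The 1/n decay asserted by Theorem 3 can be probed numerically by iterating the value recursion to large N and watching n(k − V) settle towards a constant. The sketch below assumes the standard dynamic-programming formulation (function name and normalisation are ours); for a single stop, the limiting constant 2 is the classical value from Gilbert and Mosteller [11], while for k = 2 we simply observe the quantity stabilising.

```python
def uniform_value(N, K):
    # V[k][n]: optimal expected sum for Uniform(0, 1) observations,
    # with n observations remaining and k stops still permitted
    V = [[0.0] * (N + 1) for _ in range(K + 1)]
    for k in range(1, K + 1):
        for n in range(k, N + 1):
            if n == k:
                V[k][n] = k * 0.5                      # forced stops
            else:
                a, b = V[k - 1][n - 1], V[k][n - 1]
                t = min(max(b - a, 0.0), 1.0)          # clamped threshold
                V[k][n] = b * t + a * (1.0 - t) + (1.0 - t * t) / 2.0
    return V

V = uniform_value(4000, 2)
for k in (1, 2):
    # n * (k - V) should approach a k-dependent constant as n grows
    print(k, [round(n * (k - V[k][n]), 4) for n in (1000, 2000, 4000)])
```

The slow drift visible in these values reflects the logarithmic correction terms that sit beneath the leading-order asymptotic.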
We noted in [1] that the behaviour of was identical for the family of distributions with exponential tails. The next example seeks to unify such distributions in the multiple stopping scenario.
Example 3.
A continuous probability density function is given with a survival function that, for sufficiently large y, satisfies
for positive Δ, and where β and γ are positive constants. Assume each of the terms in the sequence of reward values increases without bound.
We obtain the ordinary differential equation
where represents the upper incomplete gamma function, and
which is sub-dominant in the asymptotic differential equation. The solution to the differential equation is thus approximated by
This gives
We have, from [1], , and so the general case for may be presented by mathematical induction.
Theorem 4.
Let be random variables from a distribution that, for sufficiently large y, satisfies
for positive Δ, and where β and γ are positive constants. Assume each of the terms in the sequence of reward values increases without bound. Then, the asymptotic behaviour of is given by
Proof.
From (15) we have that the behaviour of is related by
We have verified the claim for in [1]. We now assume that as and use this to prove the same for . We write the asymptotic differential equation and substitute for our assumed asymptotic for to obtain
which is a separable differential equation with solution
as required. □
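For a concrete member of this exponential-tail family, take the standard exponential distribution. There the value recursion simplifies in closed form: with x ~ Exp(1) and b ≥ a, E[max(x + a, b)] = b + e^−(b−a). The sketch below (our own formulation, with stopping forced once observations run out) checks that the reward values grow logarithmically in the number of observations, consistent with Theorem 4.

```python
import math

def exp_values(N, K):
    # V[k][n]: optimal expected sum for Exp(1) observations,
    # with n observations remaining and k stops still permitted
    V = [[0.0] * (N + 1) for _ in range(K + 1)]
    for k in range(1, K + 1):
        for n in range(k, N + 1):
            if n == k:
                V[k][n] = float(k)   # forced stops: k * E[X] = k
            else:
                a, b = V[k - 1][n - 1], V[k][n - 1]
                # E[max(x + a, b)] for x ~ Exp(1); note b >= a always holds here
                V[k][n] = b + math.exp(-(b - a))
    return V

V = exp_values(5000, 3)
for k in (1, 2, 3):
    print(k, V[k][5000] / math.log(5000))  # single stop: ratio near 1
```

For one stop, the recursion v ← v + e^−v integrates to v ≈ log(n + const), so the ratio to log n tends to one; each additional stop contributes roughly another log n to the total reward.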
4. Calculating the Optimal Expectation
In this section, we continue the ideas from the single-stopping calculation to calculate the expectation for the multiple stopping rules. We proceed to find the asymptotic for the expected value , the first stopping time. We can then find the rest of the expectations in an inductive fashion under certain conditions. We now extend some of the previous notation to reflect the multiple reward sequences, as well as k stopping variables. Let denote the 1st, 2nd, …, kth stopping times, respectively, and let .
5. An Asymptotic Equation for
By recalling the notation from the beginning of this section, we obtain through
We split the summation for on a value , where :
where we apply the fact that to obtain a bound for the first summation term:
For the second summation term, as is large in the limit as , we may use the asymptotic approximations for obtained in the previous section.
For many distributions, this can be simplified through using the large-n asymptotic for and its derivatives, or obtaining an asymptotic expression for through other means. In the case where , we have from [1] that
Example 4.
Uniform Distribution for k stops.
For simplicity, we first consider the double stopping problem (). From Example 2, we have that
This yields , and thus,
We then have that
For the general result with stops for the uniform distribution, we apply the asymptotic behaviour of to obtain
For , conveniently , this retrieves . For , we have that , and so, this retrieves , consistent with our previous results.
Example 5 (Distributions with an Exponential Tail (k stops)).
We once again consider a probability distribution with a survival function that, for sufficiently large y, satisfies
where the additional conditions are described in Example 3.
We found that the sequences satisfy the asymptotic relation and so
From this we obtain asymptotic relations for the derivatives:
Hence, we obtain
Provided that is asymptotically of this form, we may make use of linearity of expectation to obtain convenient conditional formulae that are not as complicated as those encountered in the previous section.
6. An Inductive Approach for
Due to the independence of observations, it makes physical sense to view the expectation of each stopping time as a function of only the previous stopping time. We, thus, investigate the properties of each stopping time when conditioned on the previous one. We would then expect only to need to add the expected number of additional observations required to stop one more time in a ‘reduced’ optimal stopping problem. We now introduce notation to allow for more flexible interactions between stopping times.
Definition 1.
Let denote the jth stopping time (out of k) in the optimal stopping problem with N observations.
This notation generalises that used in the previous section and, in the case of a single stop, reduces to the stopping time of the single-stopping problem.
Theorem 5.
Let be independent and identically distributed random variables where the expectations of the optimal stopping times exist. Suppose further that the first stopping time out of k stops has asymptotic expectation for some constant . Then, the following relation holds:
where is some other positive constant.
Proof.
We first show that the conditional expectation, , is given by
where the last equality follows directly from the definition of in the previous section. By the assumption of the form of the expectation, we thus have that
and so through linearity properties of expectation, as well as the law of total expectation, we finally obtain
□
This has a reasonable physical interpretation: we would expect the expectation of the next stopping time to be the expectation of the jth stopping time plus the additional time it would take to stop once more in the revised stopping problem. This revised problem has a reduced number of observations on average, as well as fewer stops remaining, since we have already stopped j times out of the original k. A consequence of this theorem is that the asymptotic expectations may all be considered linear.
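The relation in Theorem 5 can be checked by Monte Carlo simulation. The sketch below (function names and the dynamic-programming formulation are our own assumptions) computes the Uniform(0, 1) value table, runs the induced threshold rule, and estimates the expectations of the two stopping times in the double stopping problem; the gap between the two estimates reflects the additional expected waiting time of the reduced problem.

```python
import random

def uniform_value(N, K):
    # V[k][n]: optimal expected sum, Uniform(0, 1), n observations and k stops left
    V = [[0.0] * (N + 1) for _ in range(K + 1)]
    for k in range(1, K + 1):
        for n in range(k, N + 1):
            if n == k:
                V[k][n] = k * 0.5
            else:
                a, b = V[k - 1][n - 1], V[k][n - 1]
                t = min(max(b - a, 0.0), 1.0)
                V[k][n] = b * t + a * (1.0 - t) + (1.0 - t * t) / 2.0
    return V

def simulate_stops(N, K, V, rng):
    """Return the K stopping times under the optimal threshold rule."""
    times, k = [], K
    for m in range(1, N + 1):
        n = N - m                # observations remaining after this one
        x = rng.random()
        # Stop if forced (fewer observations left than stops needed)
        # or if x exceeds the threshold V[k][n] - V[k-1][n]
        if n < k or x + V[k - 1][n] > V[k][n]:
            times.append(m)
            k -= 1
            if k == 0:
                break
    return times

rng = random.Random(1)
N, K = 500, 2
V = uniform_value(N, K)
runs = [simulate_stops(N, K, V, rng) for _ in range(5000)]
m1 = sum(r[0] for r in runs) / len(runs)
m2 = sum(r[1] for r in runs) / len(runs)
print(m1 / N, m2 / N)  # the stopping fractions settle as N grows
```

Repeating this for several values of N shows both estimated expectations growing linearly in N, in line with the theorem.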
7. Asymptotic Equations for
We now demonstrate how the ideas discussed in the previous section can be applied to obtain all of the multiple stopping times for some classes of distribution. Simulations were also conducted to support the asymptotic calculations in this study. Figure 1 compares the large-N asymptotic predictions for the expected duration for the standard uniform distribution under the optimal double stopping rule and Figure 2 shows a similar comparison under a triple stopping rule for the exponential and the gamma distribution. In this section, we will now adopt the previous notation for the jth stopping time out of k, for N observations as .
Figure 1.
Comparison of large-N asymptotic predictions for the expectation of the optimal double stopping rule for the standard uniform distribution ().
Figure 2.
Comparison of large-N asymptotic predictions for the expectation of the optimal triple stopping rule for the standard exponential distribution () and the gamma distribution. (). The rows correspond to the first, second, and third stops, respectively. The first column is for the exponential distribution, and the second for the gamma distribution.
These results were verified by simulation (see Figure 1), which shows the realised expectation converging to the asymptotic prediction. This procedure can be extended to calculate further results when more stopping times are permitted. Simulation results (see Figure 2) for the exponential and gamma distributions further support these asymptotic calculations.
Example 6 (Double Stopping on the Uniform Distribution).
For the uniform distribution, we obtain an asymptotic expression for the first stopping time out of k stops. It is given by
where is defined through the recursive scheme
We can then apply Equation (21), for the example of the double stopping case, recursively to obtain and .
Example 7 (Multiple Stopping on Distributions with Exponential Tails).
We saw that the expressions can get somewhat unwieldy for the uniform distribution. However, the structure sometimes behaves nicely, leading to a closed-form result that need not be recursively evaluated. This is true for the class of distributions outlined in Example 3. We prove the following theorem for the general asymptotic expectation:
Theorem 6.
Let be independent and identically distributed random variables. Define to be the jth stopping time out of k stops for the sequence of N observations. If , then we have that
Proof.
We see that the base case is in agreement; we now proceed with the inductive argument that
□
8. Conclusions
In this paper, we have derived asymptotics of multiple optimal stopping times for sequences of independent identically distributed continuous random variables by extending the methodology that we developed for the single-stopping case. It is anticipated that a similar class of results can be established for other classes of distribution, but it is not clear if the resulting differential equations are easily solvable.
Asymptotic calculations were performed on a number of probability distributions, on both bounded and unbounded domains. The asymptotic properties for were then inductively obtained. Numerical simulations were subsequently performed to calculate the expectation of the optimal stopping rule for a range of values of N, ranging from 10 to 1000. In each case, the simulated results tended towards the asymptotic prediction in the large-N limit, validating the asymptotic and inductive approach.
Author Contributions
Conceptualisation, H.N.E., C.J.L. and G.Y.S.; methodology, H.N.E., C.J.L. and G.Y.S.; simulation, H.N.E.; writing—original draft preparation, H.N.E.; writing—review and editing, H.N.E. and G.Y.S.; visualisation, H.N.E. and C.J.L.; project administration, G.Y.S. All authors have read and agreed to the published version of the manuscript.
Funding
This research received no external funding.
Data Availability Statement
The datasets simulated during the study are available from H.N.E. on reasonable request.
Conflicts of Interest
The authors declare no conflicts of interest.
References
- Entwistle, H.N.; Lustri, C.J.; Sofronov, G.Y. On asymptotics of optimal stopping times. Mathematics 2022, 10, 194.
- DeGroot, M.H. Optimal Statistical Decisions; John Wiley & Sons: Hoboken, NJ, USA, 2005; Volume 82.
- Ferguson, T.S. Who solved the secretary problem? Stat. Sci. 1989, 4, 282–296.
- Freeman, P.R. The Secretary Problem and Its Extensions: A Review. Int. Stat. Rev. 1983, 51, 189–206.
- Kubicka, E.M.; Kubicki, G.; Kuchta, M.; Morayne, M. Secretary problem with hidden information; searching for a high merit candidate. Adv. Appl. Math. 2023, 144, 102468.
- Presman, E.L.; Sonin, I.M. The best choice problem for a random number of objects. Theory Probab. Its Appl. 1973, 17, 657–668.
- Chow, Y.S.; Robbins, H.; Siegmund, D. Great Expectations: The Theory of Optimal Stopping; Houghton Mifflin: Boston, MA, USA, 1971.
- Ernst, M.; Szajowski, K.J. Average number of candidates surveyed by the headhunter in the recruitment. Math. Appl. 2021, 49, 31–53.
- Mazalov, V.V.; Peshkov, N.V. On asymptotic properties of optimal stopping time. Theory Probab. Its Appl. 2004, 48, 549–555.
- Demers, S. Expected duration of the no-information minimum rank problem. Stat. Probab. Lett. 2021, 168, 108950.
- Gilbert, J.P.; Mosteller, F. Recognizing the Maximum of a Sequence. J. Am. Stat. Assoc. 1966, 61, 35–73.
- Yasuda, M. Asymptotic Results for the Best-Choice Problem with a Random Number of Objects. J. Appl. Probab. 1984, 21, 521–536.
- Yeo, G.F. Duration of a secretary problem. J. Appl. Probab. 1997, 34, 556–558.
- Gnedin, A.V. On the Full Information Best-Choice Problem. J. Appl. Probab. 1996, 33, 678–687.
- Moser, L. On a problem of Cayley. Scr. Math. 1956, 22, 289–292.
- Kennedy, D.P.; Kertz, R.P. Limit Theorems for Threshold-Stopped Random Variables with Applications to Optimal Stopping. Adv. Appl. Probab. 1990, 22, 396–411.
- Kennedy, D.P.; Kertz, R.P. The asymptotic behavior of the reward sequence in the optimal stopping of iid random variables. Ann. Probab. 1991, 19, 329–341.
- Bayón, L.; Fortuny Ayuso, P.; Grau, J.; Oller-Marcén, A.; Ruiz, M. A new method for computing asymptotic results in optimal stopping problems. Bull. Malays. Math. Sci. Soc. 2023, 46, 46.
- Pearce, C.E.; Szajowski, K.; Tamaki, M. Duration problem with multiple exchanges. Numer. Algebr. Control Optim. 2012, 2, 333.
- Haggstrom, G.W. Optimal sequential procedures when more than one stop is required. Ann. Math. Stat. 1967, 38, 1618–1626.