Trapping the Ultimate Success

We introduce a betting game where the gambler aims to guess the last success epoch from the observed past. The player may bet on the event that no further successes occur, or choose a 'trap', which is any span of future times. In the latter case, winning means that the last success turns out to be the only one falling in the trap. The game is closely related to the sequential decision problem of maximising the probability of stopping at the last success in a finite sequence of trials. We use this connection to analyse the problem of stopping at the last record for trials paced by a Pólya-Lundberg process with a log-series distribution of the total number of trials.


Introduction
Suppose a series of inhomogeneous Bernoulli trials with a given profile of success probabilities p = (p_k, k ≥ 1) is paced randomly in time by some independent point process. As the outcomes and epochs of the first k ≥ 0 trials get known at some time t, the gambler is asked to bet on the time of the last success. The gambler is allowed to choose from three strategies: 1) a trap strategy, where the gambler wins when the last success epoch gets isolated in a proper set of future times, that is, it falls in the trap while no other success epoch occurs; 2) a next strategy, where winning is achieved if exactly one success happens in the future; and 3) a bygone strategy, where the gambler wins if no further successes occur.
A classic profile is related to the random records model, where the trials can be uniquely ranked and the exchangeability of ranks entails p_k = 1/k. For such a profile, trapping is an instance of a stopping strategy in the best choice problem, where the objective is to recognise the overall best trial, the last record, at the moment it occurs [3, 4, 10, 14, 22, 26, 27]. Other choices of p are suggested by random combinatorial structures and many other areas where inhomogeneous Bernoulli trials play an eminent role [28].
The analogous trapping game in discrete time is amenable to study by means of the optimal stopping theory for Markov chains. In this case, the state space for the sequence of successes is just the set of natural numbers, and every Markovian stopping time coincides with a trapping strategy determined by a set of integers. The problem with a fixed number of trials and general p has been discussed in [8, 25], and [23] treats the best choice problem with a random number of trials. However, these setups differ from the continuous-time game, where both the index of a trial and its time are important decision variables.
Regarding the pacing point process, we shall assume that it is mixed binomial without multiple points. The assumption entails that the pair (t, k) is a sufficient statistic summarising the data observed before time t. The setting covers the wide class of mixed Poisson processes, among many others. In a nutshell, the pacing process is characterised by the prior distribution π of the total number of trials and some continuous background distribution of i.i.d. 'arrivals'. Without loss of generality, the model is standardised by assuming that the arrivals are uniformly distributed. That is to say, whenever the number of trials is n, they are paced at the locations of n uniform order statistics on [0, 1].
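The pacing model just described is easy to prototype. The sketch below is illustrative only (all function names are ours, and the truncation bound in the prior sampler is an implementation convenience): it samples the number of trials from a power-series prior, places them at uniform locations, and marks successes by thinning with the profile p.

```python
import random

def sample_power_series_n(weights, q, rng, nmax=500):
    # prior pi_n proportional to weights(n) * q^n, truncated at nmax for sampling
    probs = [weights(n) * q ** n for n in range(1, nmax + 1)]
    u = rng.random() * sum(probs)
    for n, pr in enumerate(probs, start=1):
        u -= pr
        if u <= 0:
            return n
    return nmax

def sample_paced_trials(weights, q, profile, rng):
    """Return sorted (time, index, success) triples for one realisation."""
    n = sample_power_series_n(weights, q, rng)
    times = sorted(rng.random() for _ in range(n))
    # trial k succeeds with probability profile(k), independently of everything else
    return [(t, k, rng.random() < profile(k)) for k, t in enumerate(times, start=1)]

rng = random.Random(1)
# log-series weights w_n = 1/n and the record profile p_k = 1/k (illustrative choices);
# note p_1 = 1, so the first trial is always a success
trials = sample_paced_trials(lambda n: 1.0 / n, 0.9, lambda k: 1.0 / k, rng)
print(len(trials), trials[0][2])
```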
The most obvious instance of a trapping strategy is a z-strategy. For 0 < z < 1, this leaves the last (1 − z) proportion of the remaining time to trap the last success. The edge value z = 0 corresponds to the action next. We will give a simple condition on p ensuring that a z-strategy is optimal among all trapping strategies.
A question of central interest in this paper is the characterisation of pairs (p, π) for which, in some states (t, k), the action bygone outperforms next but a proper trapping is better still. The question is motivated by optimal stopping problems, in which the gambler's online strategy is an arbitrary adapted stopping time and the objective is to stop at the last success epoch. If the stopping problem belongs to the so-called monotone case [13], the optimal strategy is myopic, that is, it prescribes stopping at the earliest success epoch at which bygone becomes more beneficial than next. Therefore, in the monotone case trapping cannot be better than both bygone and next. Conversely, trapping can be used to assess whether the stopping problem belongs to the monotone case.
It is inherent in the model to consider each prior within the context of a family of power series distributions with given shape weights (w_n) and scale parameter q > 0. This allows one to define the critical cutoffs for trapping and for the stopping problem in terms of roots of certain power series in the variable x = (1 − t)q.
The random records model with geometric prior has the special feature that the point process of record epochs is Poisson. Then the optimal strategy has a single cutoff approaching 1/e as q → 1 [9, 11, 12]. The limit form, commonly called the 1/e-strategy, coincides with the z-strategy for z = 1/e referred to the decision time t = 0. It is known that the 1/e-strategy stops at the last success with probability at least 1/e, provided the number of trials is non-zero, and this bound is the best possible [5, 11, 18]. Recently, it was observed [10] that the 1/e-strategy is not optimal for the problem with trials occurring at the times of a linear birth process. Here, we will cast the model of [10] in the context of the log-series prior, apply z-strategies, and show that the stopping problem does not belong to the monotone case.
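The 1/e bound lends itself to a quick Monte Carlo check. The sketch below is a hedged illustration (our own naming; the geometric prior with q = 1/2 is chosen purely for demonstration): it simulates records at uniform arrival times and applies the 1/e-strategy, stopping at the first record after time 1/e.

```python
import math
import random

def simulate_one(rng, q=0.5, cutoff=1 / math.e):
    # geometric number of trials, n >= 1
    n = 1
    while rng.random() < q:
        n += 1
    times = sorted(rng.random() for _ in range(n))
    # trial k is a record with probability 1/k, independently
    records = [t for k, t in enumerate(times, start=1) if rng.random() < 1.0 / k]
    late = [t for t in records if t > cutoff]
    # stop at the first record after the cutoff; win iff it is the last record overall
    return bool(late) and late[0] == records[-1]

rng = random.Random(7)
wins = sum(simulate_one(rng) for _ in range(20000))
print(wins / 20000)  # the theoretical winning probability is at least 1/e ~ 0.368
```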

The probability model
Let π be a power series distribution (1) with weights (w_n), that is, π_n proportional to w_n q^n. The associated mixed binomial process on the unit interval is an orderly counting process (N_t, t ∈ [0, 1]) with the uniform order statistic property. The process can also be seen as a time-inhomogeneous pure-birth process, with transition rate expressible through the generating function of (w_n); see [24]. The posterior distribution of the number of trials yet to occur is again a power series distribution, with the scale variable x = (1 − t)q and a normalisation function f_k(x). The conditioning relation (2) appears in many statistical problems related to censored data. In principle, instead of considering a family of processes (N_t) with parameter q, we could deal with one Markov process defined as a function of the 'size' variable (3), where q > 0 assumes values within the range of convergence of Σ_n w_n q^n. We prefer not to adhere to this viewpoint, as the 'real time' parameter is more intuitive. Nevertheless, we will switch back and forth between t and x, as x is more suitable for power series work.
The arrivals are assumed to be uniformly distributed due to the following nice self-similarity features. Conditionally on N_t = k: (i) the point processes of trials on [0, t) and (t, 1] are independent; (ii) (N_{t+s(1−t)} − N_t, s ∈ [0, 1]) is a mixed binomial process on [0, 1], with the number of trials distributed according to (2).
Let p = (p_k, k ≥ 1) be a profile of success probabilities. We assume that the trial at index k (the k-th trial) is a success with probability p_k, independently of the other trials and of the pacing process. Thus, the point process of success epochs is obtained from (N_t) by thinning, where the k-th point is removed with probability 1 − p_k. Typically, the point process of success epochs is not Poisson, nor even Markovian. We denote by (t, k) the state of the counting process, meaning the event N_t = k, and write (t, k)• if the k-th trial occurs at time t and is a success.

The trapping game and stopping problem
A single round of the trapping game played in the generic state (t, k) is the following. The player chooses either a proper subset of the interval (t, 1] (a trap), or one of the actions next and bygone. A z-strategy corresponds to the interval with endpoints t + z(1 − t) and 1. Such an interval is called final, and its left endpoint is called the cutoff. The gambler's objective is to choose an admissible action to maximise the probability of isolating the last success epoch from the other successes.
For the trapping game, it is irrelevant whether the state is (t, k) or (t, k)•. The game in state (t, k) can be reduced to the game in state (0, 0) by assigning to (2) the role of a prior and truncating the profile of success probabilities. In state (0, 0), the trap of the z-strategy is just the final interval (z, 1]. A reason to consider the state (t, k) as a variable is the connection with the following optimal stopping problem (mentioned in the Introduction). Consider the increasing filtration of sigma-algebras induced by the natural information flow, so that the data available at time t comprise the locations of the trials on [0, t] and their outcomes. Let τ be an adapted stopping time, viewed as an online strategy of the gambler. For a given succession of the trials, τ takes values in the set of success epochs together with 1. The gambler wins a pound if the last success epoch is (τ, N_τ)•; otherwise there is no payoff. In particular, there is no payoff in the event τ = 1. The objective is to maximise the winning probability (equal to the expected payoff).
A stopping time is said to be Markovian if the decision in state (t, k)• depends only on the state, but not on the trials before time t or on the initial state. A trapping strategy can be seen as a randomised stopping time initiated in some state (t_0, k_0) or (t_0, k_0)•. It is non-Markovian because it depends on the initial state; in particular, for the z-strategy the stopping condition involves t ≥ t_0 + z(1 − t_0).

The fixed-n game
The trapping game with a fixed number of trials is not trivial in itself. It can be seen as the game of an informed gambler who learns the number of future trials at the time of decision.

Discrete time
Suppose the gambler in state (t, k) learns that there are j trials yet to occur, making the total n = k + j. The number of successes in the unseen trials k+1, ..., n has probability generating function

∏_{i=k+1}^{n} (1 − p_i + p_i λ).

The probability of zero successes is

s_0(k+1, n) = ∏_{i=k+1}^{n} (1 − p_i),

and the probability of exactly one success is

s_1(k+1, n) = s_0(k+1, n) Σ_{i=k+1}^{n} p_i/(1 − p_i).

There is an obvious recursion relating s_0 and s_1, which we can write as

s_1(k, n) − s_1(k+1, n) = p_k (s_0(k+1, n) − s_1(k+1, n)).    (4)

Since s_0(k+1, n) − s_1(k+1, n) has the same sign as 1 − Σ_{i=k+1}^{n} p_i/(1 − p_i), the sequence of differences s_1(k+1, n) − s_1(k, n), as k varies, has at most one variation of sign, namely its sign pattern is + ... + 0 − ... −. It follows that: (i) s_1(·, n) is unimodal with at most two (adjacent) maximum locations; (ii) the modes are non-decreasing in n.
For n fixed, we also have: (iii) the mode is precisely the minimal location where bygone starts outperforming next.
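The quantities s_0, s_1 and the unimodality of s_1(·, n) can be checked numerically. A minimal sketch, using the standard product and sum-of-odds expressions for inhomogeneous Bernoulli trials and an illustrative profile p_k = 1/(k+1) (chosen so that all p_k < 1):

```python
def s0(k1, n, p):
    """P(no successes among trials k1..n)."""
    out = 1.0
    for i in range(k1, n + 1):
        out *= 1.0 - p(i)
    return out

def s1(k1, n, p):
    """P(exactly one success among trials k1..n)."""
    return s0(k1, n, p) * sum(p(i) / (1.0 - p(i)) for i in range(k1, n + 1))

p = lambda k: 1.0 / (k + 1)  # illustrative profile with p_k < 1
n = 30
vals = [s1(k + 1, n, p) for k in range(n)]   # s1(k+1, n) for k = 0..n-1
diffs = [b - a for a, b in zip(vals, vals[1:])]
turned = False                               # sign pattern + ... + 0 - ... -
for d in diffs:
    if d < 0:
        turned = True
    assert not (turned and d > 0), "unimodality violated"
mode = max(range(n), key=lambda k: vals[k])
print(mode)  # mode of s1(., n)
```

The mode agrees with the sum-of-odds criterion: it is the least k with Σ_{i>k+1} p_i/(1 − p_i) ≤ 1.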
A well-known fact of optimal stopping theory [8] is that the set A* = {k* + 1, ..., n}, with k* the mode, is optimal, in the sense that no other set A ⊂ {1, ..., n} isolates the last success in n trials with a higher probability. The following is a direct variational proof. Clearly, n ∈ A is necessary for A to be optimal. By induction, suppose we have shown that {k

Monotonicity in n. We show next that the sign of max_k s_1(k+1, n) − max_k s_1(k+1, n+1) is the same as the sign of p_{k*+1} − p_{n+1}. In particular, the winning probability is non-increasing in n if p_n ↓. To argue this point, we identify A* = {k* + 1, ..., n} with a stopping strategy in discrete time. By increasing the number of trials by one, the mode may either remain the same or increase by one.

Firstly, compare how A* performs in the n-problem with the stopping set B := {k* + 1, ..., n + 1} applied in the problem with n + 1 trials. Clearly, the strategies A* and B differ only if the (n+1)st trial is a success and the number of successes in trials k* + 1, ..., n is either 1 or 0; thus, the winning probabilities differ by p_{n+1}(s_1(k* + 1, n) − s_0(k* + 1, n)). Secondly, compare A* with the other option, C := {k* + 2, ..., n, n + 1}. The difference of the winning probabilities of A* in the n-problem and C in the (n+1)-problem has four component probabilities, and it has the same sign as p_{k*+1} − p_{n+1}, because the first factor is non-negative by the optimality of A*.

Fixed n, trapping in continuous time
In the elementary continuous-time scenario, a fixed number n of trials occur at uniformly sampled locations on [0, 1]. In state (t, k), the trapping probability for the z-strategy is a Bernstein polynomial in z,

S_1(k, n; z) = Σ_{j=0}^{n−k} (n−k choose j) z^j (1 − z)^{n−k−j} s_1(k+j+1, n),    (6)

where j counts the future trials falling before the cutoff. Replacing s_1 by s_0 in this formula gives the probability, denoted S_0(k, n; z), that none of the successes gets trapped by the z-strategy, with S_0(k, n; 0) equal to the probability to win with bygone. The dependence on t is void, since conditionally on k arrivals before time t there is a non-random number n − k of arrivals uniformly paced on (t, 1]. Note that s_0(k+1, n) = S_0(k, n; 0) and s_1(k+1, n) = S_1(k, n; 0). The form of the optimal stopping strategy in the fixed-n discrete-time problem and the theorem about excluding randomised stopping times [13] imply that trapping is ineffective if bygone is better than next. This holds for k, n satisfying

Σ_{i=k+1}^{n} p_i/(1 − p_i) ≤ 1.    (7)

Replacing a final interval by any other trap does not change the conclusion.
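Assuming the Bernstein representation above, S_1(k, n; z) is straightforward to evaluate; a sketch (our naming) for the record profile p_k = 1/k with k ≥ 1, checking the boundary values S_1(k, n; 0) = s_1(k+1, n) and S_1(k, n; 1) = 0:

```python
import math

def s0(k1, n, p):
    """P(no successes among trials k1..n)."""
    out = 1.0
    for i in range(k1, n + 1):
        out *= 1.0 - p(i)
    return out

def s1(k1, n, p):
    """P(exactly one success among trials k1..n); requires p_i < 1 on the range."""
    return s0(k1, n, p) * sum(p(i) / (1.0 - p(i)) for i in range(k1, n + 1))

def S1(k, n, z, p):
    """Trapping probability of the z-strategy in state (t, k) with n trials in total."""
    m = n - k  # future trials; Binomial(m, z) of them fall before the cutoff
    return sum(math.comb(m, j) * z ** j * (1 - z) ** (m - j) * s1(k + j + 1, n, p)
               for j in range(m + 1))

p = lambda k: 1.0 / k  # record profile; we keep k >= 1 so that p_i < 1 for i >= 2
print(S1(1, 10, 0.0, p), s1(2, 10, p))  # the two values agree
print(S1(1, 10, 1.0, p))                # trapping nothing: probability 0
```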
From the unimodality of s_1(·, n) and the shape-preserving properties of Bernstein polynomials (see [16], Theorem 3.3), it follows that (6) is unimodal in z. Therefore, a unique strategy exists which is optimal among the z-strategies. Mimicking the discrete-time variational argument, it will be shown next that other traps (Borel sets) cannot be better.
Theorem 1. The optimal trapping strategy on n trials is a z-strategy, where z is the unique mode of S_1(k, n; ·). The mode is 0 in the case (7), and otherwise z ∈ (0, 1).
Proof. To ease notation, we consider the state (0, 0), which is sufficient. There is certainly a final interval that belongs to the optimal trap, because near the end of the time interval the probability of two or more successes is negligible. Now, suppose [z, 1] belongs to the trap and we are assessing whether the length element [z − dz, z] is worth including. The change of the winning probability due to the inclusion is some factor, depending on the structure of the trap within [0, z − dz], multiplied by the polynomial (8). By (4), in the variable z/(1 − z) this polynomial has at most one variation of sign of the coefficients. Applying Descartes' rule of signs, we see that the polynomial has at most one positive root. This implies that the optimal trap is a final interval with cutoff coinciding with the root, or [0, 1] if there is no root.
It remains to check that the root, if any, coincides with the mode of S_1(0, n; ·). Indeed, computing the derivative and using (4) shows that it is the negative of the polynomial in (8). This gives the desired conclusion.

Examples
The best choice problem. In the random records model, the formula p_k = 1/k for the probability of a record is a consequence of the exchangeability of the ranks of the trials. The Bernstein polynomials S_1(k, n; ·) converge uniformly as n → ∞, and the sequence of modes converges to 1/e. In the case k = 0, the Bernstein polynomial can be written in the form of a Taylor polynomial, which decreases pointwise to z ↦ −z log z as n increases. As observed in [5], the modes increase monotonically to 1/e, and also max_z S_1(0, n; z) ↓ 1/e.
These facts underlie the minimax property of the stopping strategy with the single cutoff 1/e, known as the 1/e-strategy [5, 7, 12]. See [18] for a recent analysis of the strategic dominance of this and other minimax strategies.
For k > 0, the above nice monotonicity properties are no longer valid: the minimax value is below 1/e and the 1/e-cutoff strategy is not minimax. This is seen already in the case k = 1, where the Bernstein polynomials have alternative representations; the first of these is derived by conditioning on the highest rank j of the trials that occur on [0, z].

The Karamata-Stirling profile. The profile p_k = θ/(θ + k − 1), with parameter θ > 0, plays a central role in the combinatorial structures related to the Ewens sampling formula for random partitions [1]. The term Karamata-Stirling law was coined in [2] for the distribution of the number of successes with these probabilities. The number of successes in trials k + 1, ..., n has an explicit probability generating function. As n → ∞, S_1(0, n; z) → −θ z^θ log z, and the modes converge to e^{−1/θ}. The shapes vary considerably with θ: for large θ, the minimax trapping value is close to zero.
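The limiting mode e^{−1/θ} is recovered by maximising the limit function −θ z^θ log z on a grid; a small self-contained sketch:

```python
import math

def limit_trap_prob(theta, z):
    """Limit of S1(0, n; z) as n -> infinity for the Karamata-Stirling profile."""
    return 0.0 if z == 0.0 else -theta * z ** theta * math.log(z)

grid = [i / 10000 for i in range(10001)]
for theta in (0.5, 1.0, 2.0):
    mode = max(grid, key=lambda z: limit_trap_prob(theta, z))
    print(theta, round(mode, 4), round(math.exp(-1 / theta), 4))
```

Setting the derivative −θ z^{θ−1}(θ log z + 1) to zero gives log z = −1/θ, so the numerical argmax matches e^{−1/θ}.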

Random number of trials
The discrete-time best choice problem with a random number of trials was pioneered in [23].
The following features readily extend to general profiles p. The sequence of success epochs is a Markov chain on the non-negative integers, and a stopping strategy can be identified with a set of integers.
In general, the optimal stopping set A* is not a gap-free integer interval; rather, it is comprised of 'stopping islands' whose number and configuration depend on the prior. It is important to note that A* is a universal set, not depending on the initial state, in the sense that the 'trap' is A* ∩ {k + 1, k + 2, ...} when proceeding from position k. This is different from the problem in continuous time, where the optimal traps in different states (t, k) are not consistent, unless the point process of success epochs is Poisson.

Tests for the monotone case of optimal stopping
We proceed with the continuous-time setting, assuming p and π given. In state (t, k), the probability to isolate the last success with the z-strategy is a convex mixture of Bernstein polynomials,

S_1(t, k; z) = Σ_{n ≥ k} π(n | t, k) S_1(k, n; z),    (10)

where π(· | t, k) is the posterior distribution (2) of the total number of trials. The z = 0 instance is the probability to win with next, and S_1(t, k; 1) = 0. Similarly, the probability that none of the successes is trapped by the z-strategy is S_0(t, k; z), given by the same mixture with S_0(k, n; ·) in place of S_1(k, n; ·), and S_0(t, k; 0) is the probability to win with bygone.
Using (2) and (3), we can cast the winning probabilities as functions of the size variable x = (1 − t)q: we write P_k(x) for the probability to win with bygone, R_k(x, z) for the probability that the z-strategy traps the last success, and Q_k(x) for the probability to win with next. Thus, Q_k(x) = R_k(x, 0). We are looking next at some critical 'cutoffs' for the trapping game and optimal stopping.

Lemma 2. The equation P_k(x) = Q_k(x) has at most one root α_k > 0, for every k ≥ 1.
Proof. The series P_k(x) − Q_k(x) has at most one change of sign of the coefficients, from + to −; hence Descartes' rule of signs for power series [15] entails that there is at most one positive root.
We set α_k = ∞ if the root does not exist, and define the cutoff

a_k = (1 − α_k/q)^+.

This is the earliest time when bygone becomes as beneficial as next. The myopic stopping strategy starting at time t_0 stops at the first success epoch (t, k)• with t ≥ max(t_0, a_k). Keep in mind that if the sequence (α_k) is monotone, then (a_k) is also monotone but with the monotonicity direction reversed. The monotone case of optimal stopping, and with it the optimality of the myopic strategy, hold if a_k ↓.
Lemma 3. For every k ≥ 1, the equation D_z(t, k, π, 0) = 0 has at most one root β_k > 0 in the variable x, and α_k ≤ β_{k+1}.

Proof. We follow the argument in Lemma 2. The derivative at z = 0, regarded as a function of x ≥ 0, has at most one change of sign, and then from + to −. Furthermore, α_k ≤ β_{k+1}; this follows by comparing the series and noting that the weights at the positive terms in D_z are higher.
If there is no finite root, we set β_k = ∞. With b_k the corresponding time cutoff, b_k ≥ a_{k+1} by Lemma 3. Thus, b_k is the earliest time when the action next at index k cannot be improved by a z-strategy with small z.
To summarise the above: • For t < a k : next is better than bygone.
• For t < b k : a trapping strategy is better than next.
Theorem 4. The optimal stopping problem belongs to the monotone case (for every q and arbitrary initial state) if and only if the sequence of roots (α_k) is nondecreasing. In that case the roots (α_k) and (β_k) interlace.

Proof. We argue in probabilistic terms. The bivariate sequence of success epochs (t, k)• is an increasing Markov chain. The monotone case of optimal stopping occurs iff the set of states where bygone outperforms next is closed, which holds iff this is an upper subset with respect to the partial order on [0, 1] × {1, 2, ...}. The latter property amounts to the monotonicity condition α_k ↑. By Lemma 3, the inequality α_k ≤ β_{k+1} always holds. In the monotone case, if in some state (t, k)• the actions bygone and next are equally good, then trapping cannot improve upon these by the optimality of the myopic strategy; in analytic terms this translates as the inequality α_k ≤ β_k. The monotone case does not hold if α_{k+1} < α_k for some k. Alternatively, one can use β_k < α_k as a test: indeed, if β_k < α_k and a_k > 0, then in state (a_k, k)• a trapping strategy is better than both bygone and next.

Unimodality and concavity
Being a convex mixture of unimodal functions, S_1(t, k; ·) itself need not be unimodal. Accordingly, the optimal trap may fail to be a final interval. Concavity is a simple condition ensuring the unimodality of S_1(t, k; ·).
Suppose s_1(·, n) is concave for every n ≥ 1, that is, the second difference in the first variable is non-positive. By the shape-preserving properties of Bernstein polynomials, the internal sum in (10) is then a concave function of z, hence the mixture S_1(t, k; ·) is also concave. In that case we have:

Theorem 5. If s_1(·, n) is concave for every n, then the z-strategy with cutoff z equal to the mode of S_1(t, k; ·) is optimal among all trapping strategies. The mode is distinct from 0 iff t < b_k.
Proof. The overall optimality follows from the unimodality, as in Theorem 1. By concavity, the mode is zero if D_z(t, k, π, 0) ≤ 0, and is positive otherwise.
The concavity is easy to express explicitly in terms of p. For instance, consider the second difference in the first variable for k = 1. The second difference in the variable k of the probability generating function, evaluated via the derivative D_λ at λ = 0, yields the second difference of s_1(·, n), see (12). Hence one obtains a sufficient condition (13) for concavity. We stress that (13) ensures unimodality for arbitrary π and only involves two consecutive success probabilities. The price to pay for this generality is that the condition is restrictive, as seen in Figure 3.
For the Karamata-Stirling profile, a straightforward calculation shows that (12) is non-positive for θ in a narrow range only; nevertheless, the range includes the two most important cases θ = 1 and θ = 1/2.
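The claimed concavity for θ = 1 and θ = 1/2 can be verified numerically from the explicit product formulas for the Karamata-Stirling profile; a sketch (our naming, with the boundary case p_1 = 1 handled separately):

```python
def s0_ks(k1, n, theta):
    """P(no successes among trials k1..n), profile p_k = theta/(theta + k - 1)."""
    out = 1.0
    for i in range(k1, n + 1):
        out *= (i - 1) / (theta + i - 1)
    return out

def s1_ks(k1, n, theta):
    """P(exactly one success among trials k1..n)."""
    if k1 == 1:
        # p_1 = 1: trial 1 is surely a success, so no later success may occur
        return s0_ks(2, n, theta)
    return s0_ks(k1, n, theta) * sum(theta / (i - 1) for i in range(k1, n + 1))

n = 25
for theta in (0.5, 1.0):
    g = [s1_ks(k + 1, n, theta) for k in range(n)]
    second = [g[k + 1] - 2 * g[k] + g[k - 1] for k in range(1, n - 1)]
    assert all(d <= 1e-12 for d in second), theta
print("second differences non-positive for theta in {0.5, 1}")
```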

The best choice problem under the log-series prior
In this section, we consider the classic profile p_k = 1/k from the random records model and the logarithmic series prior

π_n = c(q) q^n / n,  n ≥ 1  (so π_0 = 0),    (14)

where c(q) = |log(1 − q)|^{−1}. Two representations of such a π as a mixed Poisson distribution can be obtained by mixing before zero-truncating or after [20]. From a wider view, the setting is the (ν = 0, θ = 1) instance of the problem with negative binomial prior NB(ν, q) and the Karamata-Stirling profile, as studied in [19]. Similarly to the case 0 < ν < 1, comparison with the geometric prior yields α_k > 1 − 1/e along with α_k → 1 − 1/e. This entails that the sequence of roots (α_k) cannot be nondecreasing, hence by Theorem 4 the monotone case of optimal stopping does not hold. As was shown in Section 4.2, for this profile the best trapping strategy is a z-strategy.
Let T_1 be the time of the first trial.

Lemma 6. Under the logarithmic series prior (14):
(i) the time of the first trial T_1 has probability density function c(q) q / (1 − (1 − t)q), t ∈ [0, 1];
(ii) conditionally on T_1 = t_1, the counting process (N_t, t ∈ [t_1, 1]) is a Pólya-Lundberg birth process with transition rates λ_k(t) = kq / (1 − (1 − t)q), k ≥ 1;
(iii) in particular, conditionally on T_1 = t_1, the posterior distribution of the number of trials yet to occur is geometric with the 'failure' probability (1 − t_1)q.
Proof. Assertion (i) follows from

P(T_1 > t) = Σ_{n ≥ 1} π_n (1 − t)^n = c(q) |log(1 − (1 − t)q)|,

and (iii) from the identity

(1/(k+j)) (k+j choose j) = (1/k) (k+j−1 choose j).

The value q = 1 is on the edge of convergence; it formally corresponds to an infinite 'non-informative' prior. As a result, the Pólya-Lundberg process is well defined by the rates in (ii) for any initial state (t_0, k_0) with t_0 > 0. With initial state (t_0, 0), the model is equivalent to the model with logarithmic prior NB(0, 1 − t_0) and trials occurring on [0, 1]. In the t_0 → 0 limit, the process of record times becomes a Poisson process with intensity function t^{−1}, with the 1/e-strategy being then optimal.
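Assertion (iii) admits a direct numerical check: the joint weight of the event {1 + j trials in total, first arrival at t_1} is π_{1+j} times the density (1+j)(1 − t_1)^j of the minimum of 1 + j uniforms, and normalising over j should reproduce the geometric distribution with failure probability x = (1 − t_1)q. A sketch (parameter values and truncation length are illustrative):

```python
q, t1 = 0.8, 0.3
x = (1 - t1) * q
N = 400  # truncation length, large enough that the tail is negligible
# joint weight of {1 + j trials in total, first arrival at t1}
w = [(q ** (1 + j) / (1 + j)) * (1 + j) * (1 - t1) ** j for j in range(N)]
total = sum(w)
post = [v / total for v in w]                 # posterior of j, the future trials
geom = [(1 - x) * x ** j for j in range(N)]   # geometric with failure prob x
err = max(abs(a - b) for a, b in zip(post, geom))
print(err)
```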

Hypergeometrics
It will be helpful to recall some properties of the Gauss hypergeometric function 2F1(a, b; c; x). These include the differentiation formula

d/dx 2F1(a, b; c; x) = (ab/c) 2F1(a+1, b+1; c+1; x),

the transformation formula

2F1(a, b; c; x) = (1 − x)^{c−a−b} 2F1(c − a, c − b; c; x),

and Euler's integral representation, valid for c > b > 0,

2F1(a, b; c; x) = (Γ(c)/(Γ(b)Γ(c − b))) ∫_0^1 y^{b−1} (1 − y)^{c−b−1} (1 − xy)^{−a} dy.

The probability generating function of the number of successes following state (t, k), for k ≥ 1, is given by a hypergeometric function, from which we read off the normalisation function f_k(x). Expanding at λ = 0, we identify the two basic power series P_k and Q_k, where as before x = (1 − t)q ∈ [0, 1] and D_a is the derivative in the first parameter. The differentiation formula implies the backward recursions (15). Applying the transformation formula, we may write the winning probability with bygone, P_k(x), as a series. It is readily seen that, as k increases, this function decreases to 1 − x; the fact was shown in [9] probabilistically. The convergence to 1 − x is related to the fact that for large k the process of record times approaches a Poisson process. Explicitly, for k = 1, 2 the root equations can be written in terms of L := −log(1 − x); computing to six decimal places gives, in particular, α_2 = 0.755984.

Proposition 7. The roots satisfy α_k ↓ 1 − 1/e as k → ∞.
Proof. In the case of constant weights w_j = 1, the prior is geometric and all roots coincide with 1 − 1/e. The log-series distribution weights satisfy w_{n+1}/w_n ↑ 1; hence, comparison with the geometric distribution (see [19]) gives α_k > 1 − 1/e and α_k → 1 − 1/e. That the sequence of roots is decreasing will be shown separately.
Corollary 8. The optimal stopping problem is not monotone and the myopic strategy τ* is not optimal. Moreover:
(i) for q > 1 − 1/e, the myopic strategy is determined by an infinite sequence of cutoffs converging to 1 − (1 − 1/e)/q;
(ii) for t ≥ (1 − (1 − 1/e)/q)^+, bygone is the optimal action for every k;
(iii) for times as in (ii), the optimal stopping strategy stops greedily at the first available record.
With some manipulation, we can derive an integral formula for R_k(x, z). Consider first k ≥ 1. The number of record epochs following (t, k) and falling in the final interval [t + z(1 − t), 1] has a probability generating function expressible through 2F1. Differentiating at λ = 0 yields, for x = (1 − t)q and z ∈ [0, 1], the formula (16). For k = 0, a similar calculation with the log-series weights NB(0, x) gives an analogous formula.
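The hypergeometric identities recalled at the start of this section can be sanity-checked numerically from the Gauss series; a self-contained sketch (the truncation length is an implementation choice):

```python
def hyp2f1(a, b, c, x, terms=200):
    """Gauss series for 2F1(a, b; c; x), valid for |x| < 1."""
    total, term = 1.0, 1.0
    for n in range(terms):
        term *= (a + n) * (b + n) / ((c + n) * (n + 1)) * x
        total += term
    return total

a, b, c, x = 0.7, 1.3, 2.1, 0.35
# transformation formula: 2F1(a,b;c;x) = (1-x)^(c-a-b) 2F1(c-a, c-b; c; x)
lhs = hyp2f1(a, b, c, x)
rhs = (1 - x) ** (c - a - b) * hyp2f1(c - a, c - b, c, x)
assert abs(lhs - rhs) < 1e-10

# differentiation formula: d/dx 2F1 = (a b / c) 2F1(a+1, b+1; c+1; x)
h = 1e-6
num = (hyp2f1(a, b, c, x + h) - hyp2f1(a, b, c, x - h)) / (2 * h)
assert abs(num - a * b / c * hyp2f1(a + 1, b + 1, c + 1, x)) < 1e-5
print("identities verified")
```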

Monotonicity of cutoffs for the myopic strategy
We show next that the roots are indeed decreasing, which is the direction opposite to the one needed for the optimality of the myopic strategy. We may define the root α_k in terms of the quotient (17), Q_k(x)/P_k(x), as the unique solution on [0, 1) to Q_k(x)/P_k(x) = 1; as x runs from 0 to 1, the quotient varies from 0 to ∞. Euler's integral for the hypergeometric function specialises to give integral representations of P_k and Q_k; expanding at λ = 0 gives (18).

Lemma 9. The quotient (17) is increasing in k; consequently, α_{k+1} < α_k for all k ≥ 1.

Proof. From (18), the product Q_k(x) P_{k+1}(x) can be written as a double integral, and by the same argument a similar formula is obtained for Q_{k+1}(x) P_k(x). Splitting the integration domain, then swapping the variables on the triangle above the diagonal y > z, yields Q_k(x) P_{k+1}(x) − Q_{k+1}(x) P_k(x) < 0, because the symmetric part of the integrand is positive and the antisymmetric part is negative for x ∈ (0, 1).
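Using the representation of the posterior as NB(k, x) (Lemma 6) and the record profile, the roots α_k can be computed by bisection on the difference of the bygone and next winning probabilities; the value obtained for k = 2 can be compared with α_2 = 0.755984 quoted above. A sketch (our naming; the series truncation is an implementation choice):

```python
import math

def bygone_minus_next(k, x, jmax=4000):
    """P(bygone wins) - P(next wins) in a state with k >= 1 arrivals and size
    variable x = (1-t)q, assuming the NB(k, x) posterior and profile p_k = 1/k."""
    total, w, inner = 0.0, 1.0, 0.0  # w = binom(k+j-1, j) x^j; inner = sum_{i=k}^{k+j-1} 1/i
    for j in range(jmax):
        # s0(k+1, k+j) = k/(k+j) and s1(k+1, k+j) = s0 * inner
        total += w * (k / (k + j)) * (1.0 - inner)
        inner += 1.0 / (k + j)
        w *= (k + j) / (j + 1) * x
    return total * (1 - x) ** k

def alpha(k):
    lo, hi = 0.5, 0.995
    for _ in range(60):
        mid = (lo + hi) / 2
        if bygone_minus_next(k, mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

roots = [alpha(k) for k in (1, 2, 3, 6)]
lower = 1 - 1 / math.e
print([round(r, 6) for r in roots])
```

The computed roots decrease with k and stay above 1 − 1/e, consistently with Proposition 7 and Lemma 9.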

The information bounds
Suppose that in state (t, k) the gambler learns that there are exactly j trials yet to occur. A higher winning probability is attainable with more information, and it is one of:
(i) s_0(k + 1, k + j) for bygone,
(ii) s_1(k + 1, k + j) for next,
(iii) max_z S_1(k, k + j; z) for the best trapping,
(iv) max_{k′: k′ ≥ k} s_1(k′ + 1, k + j) for the optimal stopping strategy, now independent of the times of the trials.
Weighting these with the posterior distribution π(j | t, k) gives upper bounds I_k on the winning probability, achievable only by the informed gambler.

The value function
Define v(t, k) to be the continuation value of the state (t, k), equal to the winning probability achieved by the optimal stopping strategy starting in this state. By the optimality principle, in state (t, k)• it is optimal to stop (action bygone) iff S_0(t, k; 0) ≥ v(t, k). We have v(1, k) = 0 for k ≥ 1 and v(1, 0) = 1 since, near the end of the time interval, it is unlikely to see more trials if some have occurred, but at least one trial is ensured by the log-series prior if none occurred. Passing to x = (1 − t)q, we can write the continuation value as a function V_k(x). The optimality principle yields a recursion for the V_k's, as follows. Given N_t = k, let T_{k+1} be the next trial epoch, or 1 in the event N_1 = k. Similarly to the argument in Lemma 6, it is seen that the random variable (1 − T_{k+1})/(1 − t) has an explicit density. At the (k+1)st trial, the optimal stopping strategy chooses the better action in case the trial is a success; hence, integrating out T_{k+1}, we obtain an integral recursion with the differential form (19) for k ≥ 1. The instance k = 0 is special: integrating out the variable T_1 gives a companion relation with the differential form (20), and the initial conditions are V_0(0) = 1 and V_k(0) = 0 for k ≥ 1. By the theory of optimal stopping [13], the value function can be characterised as the minimal solution to (19), (20). For computational purposes, one can use the limit behaviour of V_k for large k. On the left part of the interval we know the value function exactly, as a consequence of Corollary 8. As a check, for k ≥ 1 a suitable change of variable simplifies (19); for x in the range where P_{k+1}(x) − V_{k+1}(x) ≥ 0, it becomes the recursion (15).
Equations (19), (20) can be solved numerically. For the critical points γ_k of the optimal strategy, since γ_k < α_k, we have γ_k → 1 − 1/e. It is natural to expect that the γ_k's are decreasing, and that the optimal stopping strategy is determined by the cutoffs (1 − γ_k/q)^+, in complete analogy with the myopic strategy τ*. This is confirmed by simulation, which also shows that the myopic and optimal strategies are very close to one another, as is evident by comparing the critical points in Table 1. Recall that these are related to real-time cutoffs via (3).
While V_k(x) → 1/e when either x → 1 or k → ∞, the simulation shows that the functions increase in x for k ≥ 1. In contrast, V_0 is decreasing, with V_0(x) ↓ 1/e as x ↑ 1; see Figure 5. The latter convergence is slow, because the logarithmic distribution of N_1 | N_t = 0 puts a relatively high weight on small values of n, which is advantageous for stopping at the last record. For instance, for q = 1 − 10^{−6} the mean is about 72382, while the probability of only one trial is still higher than 0.072382.

Table 1 legend:
α_k: critical points for the myopic strategy
β_k: balance points where next is as good as trapping
γ_k: critical points for the optimal strategy
δ_k: lower bounds on γ_k obtained from the information bound
ρ_k: balance points where bygone is as good as trapping

Figure 4: Bounds on the optimal strategy I k (x)

Figure 5: Stop and Continuation Values