Some Dissimilarity Measures of Branching Processes and Optimal Decision Making in the Presence of Potential Pandemics

Kammerer, Niels B.; Stummer, Wolfgang

doi:10.3390/e22080874


by Niels B. Kammerer ¹ and Wolfgang Stummer ²,*

¹ Königinstrasse 75, 80539 Munich, Germany

² Department of Mathematics, University of Erlangen–Nürnberg, Cauerstrasse 11, 91058 Erlangen, Germany

* Author to whom correspondence should be addressed.

Entropy 2020, 22(8), 874; https://doi.org/10.3390/e22080874

Submission received: 26 June 2020 / Revised: 27 July 2020 / Accepted: 28 July 2020 / Published: 8 August 2020

(This article belongs to the Special Issue Robust Procedures for Estimating and Testing in the Framework of Divergence Measures)


Abstract: We compute exact values, respectively bounds, of dissimilarity/distinguishability measures (in the sense of the Kullback-Leibler information distance (relative entropy) and some transforms of more general power divergences and Renyi divergences) between two competing discrete-time Galton-Watson branching processes with immigration (GWI) for which the offspring as well as the immigration (importation) is arbitrarily Poisson-distributed; in particular, we allow for arbitrary type of extinction-concerning criticality and thus for non-stationarity. We apply this to optimal decision making in the context of the spread of potentially pandemic infectious diseases (such as the current COVID-19 pandemic), e.g., covering different levels of dangerousness and different kinds of intervention/mitigation strategies. Asymptotic distinguishability behaviour and diffusion limits are investigated, too.

Keywords: Galton-Watson branching processes with immigration; Hellinger integrals; power divergences; Kullback-Leibler information distance/divergence; relative entropy; Renyi divergences; epidemiology; COVID-19 pandemic; Bayesian decision making; INARCH(1) model; GLM model; Bhattacharyya coefficient/distance

Contents

1 Introduction
2 The Framework and Application Setups
2.1 Process Setup
2.2 Connections to Time Series of Counts
2.3 Applicability to Epidemiology
2.4 Information Measures
2.5 Decision Making under Uncertainty
2.6 Asymptotical Distinguishability
3 Detailed Recursive Analyses of Hellinger Integrals
3.1 A First Basic Result
3.2 Some Useful Facts for Deeper Analyses
3.3 Detailed Analyses of the Exact Recursive Values, i.e., for the Cases $(β_{A}, β_{H}, α_{A}, α_{H}) \in P_{NI} \cup P_{SP, 1}$
3.4 Some Preparatory Basic Facts for the Remaining Cases $(β_{A}, β_{H}, α_{A}, α_{H}) \in P_{SP} \setminus P_{SP, 1}$
3.5 Lower Bounds for the Cases $(β_{A}, β_{H}, α_{A}, α_{H}, λ) \in (P_{SP} \setminus P_{SP, 1}) \times ]0, 1[$
3.6 Goals for Upper Bounds for the Cases $(β_{A}, β_{H}, α_{A}, α_{H}, λ) \in (P_{SP} \setminus P_{SP, 1}) \times ]0, 1[$
3.7 Upper Bounds for the Cases $(β_{A}, β_{H}, α_{A}, α_{H}, λ) \in P_{SP, 2} \times ]0, 1[$
3.8 Upper Bounds for the Cases $(β_{A}, β_{H}, α_{A}, α_{H}, λ) \in P_{SP, 3 a} \times ]0, 1[$
3.9 Upper Bounds for the Cases $(β_{A}, β_{H}, α_{A}, α_{H}, λ) \in P_{SP, 3 b} \times ]0, 1[$
3.10 Upper Bounds for the Cases $(β_{A}, β_{H}, α_{A}, α_{H}, λ) \in P_{SP, 3 c} \times ]0, 1[$
3.11 Upper Bounds for the Cases $(β_{A}, β_{H}, α_{A}, α_{H}, λ) \in P_{SP, 4 a} \times ]0, 1[$
3.12 Upper Bounds for the Cases $(β_{A}, β_{H}, α_{A}, α_{H}, λ) \in P_{SP, 4 b} \times ]0, 1[$
3.13 Concluding Remarks on Alternative Upper Bounds for all Cases $(β_{A}, β_{H}, α_{A}, α_{H}, λ) \in (P_{SP} \setminus P_{SP, 1}) \times ]0, 1[$
3.14 Intermezzo 1: Application to Asymptotical Distinguishability
3.15 Intermezzo 2: Application to Decision Making under Uncertainty
3.15.1 Bayesian Decision Making
3.15.2 Neyman-Pearson Testing
3.16 Goals for Lower Bounds for the Cases $(β_{A}, β_{H}, α_{A}, α_{H}, λ) \in (P_{SP} \setminus P_{SP, 1}) \times (R \setminus [0, 1])$
3.17 Lower Bounds for the Cases $(β_{A}, β_{H}, α_{A}, α_{H}, λ) \in P_{SP, 2} \times (R \setminus [0, 1])$
3.18 Lower Bounds for the Cases $(β_{A}, β_{H}, α_{A}, α_{H}, λ) \in P_{SP, 3 a} \times (R \setminus [0, 1])$
3.19 Lower Bounds for the Cases $(β_{A}, β_{H}, α_{A}, α_{H}, λ) \in P_{SP, 3 b} \times (R \setminus [0, 1])$
3.20 Lower Bounds for the Cases $(β_{A}, β_{H}, α_{A}, α_{H}, λ) \in P_{SP, 3 c} \times (R \setminus [0, 1])$
3.21 Lower Bounds for the Cases $(β_{A}, β_{H}, α_{A}, α_{H}, λ) \in P_{SP, 4 a} \times (R \setminus [0, 1])$
3.22 Lower Bounds for the Cases $(β_{A}, β_{H}, α_{A}, α_{H}, λ) \in P_{SP, 4 b} \times (R \setminus [0, 1])$
3.23 Concluding Remarks on Alternative Lower Bounds for all Cases $(β_{A}, β_{H}, α_{A}, α_{H}, λ) \in (P_{SP} \setminus P_{SP, 1}) \times (R \setminus [0, 1])$
3.24 Upper Bounds for the Cases $(β_{A}, β_{H}, α_{A}, α_{H}, λ) \in (P_{SP} \setminus P_{SP, 1}) \times (R \setminus [0, 1])$
4 Power Divergences of Non-Kullback-Leibler-Information-Divergence Type
4.1 A First Basic Result
4.2 Detailed Analyses of the Exact Recursive Values of $I_{λ} (\cdot ∥ \cdot)$, i.e., for the Cases $(β_{A}, β_{H}, α_{A}, α_{H}, λ) \in (P_{NI} \cup P_{SP, 1}) \times (R \setminus \{0, 1\})$
4.3 Lower Bounds of $I_{λ} (\cdot ∥ \cdot)$ for the Cases $(β_{A}, β_{H}, α_{A}, α_{H}, λ) \in (P_{SP} \setminus P_{SP, 1}) \times ]0, 1[$
4.4 Upper Bounds of $I_{λ} (\cdot ∥ \cdot)$ for the Cases $(β_{A}, β_{H}, α_{A}, α_{H}, λ) \in (P_{SP} \setminus P_{SP, 1}) \times ]0, 1[$
4.5 Lower Bounds of $I_{λ} (\cdot ∥ \cdot)$ for the Cases $(β_{A}, β_{H}, α_{A}, α_{H}, λ) \in (P_{SP} \setminus P_{SP, 1}) \times (R \setminus [0, 1])$
4.6 Upper Bounds of $I_{λ} (\cdot ∥ \cdot)$ for the Cases $(β_{A}, β_{H}, α_{A}, α_{H}, λ) \in (P_{SP} \setminus P_{SP, 1}) \times (R \setminus [0, 1])$
4.7 Applications to Bayesian Decision Making
5 Kullback-Leibler Information Divergence (Relative Entropy)
5.1 Exact Values Respectively Upper Bounds of $I (\cdot ∥ \cdot)$
5.2 Lower Bounds of $I (\cdot ∥ \cdot)$ for the Cases $(β_{A}, β_{H}, α_{A}, α_{H}) \in (P_{SP} \setminus P_{SP, 1})$
5.3 Applications to Bayesian Decision Making
6 Explicit Closed-Form Bounds of Hellinger Integrals
6.1 Principal Approach
6.2 Explicit Closed-Form Bounds for the Cases $(β_{A}, β_{H}, α_{A}, α_{H}, λ) \in (P_{NI} \cup P_{SP, 1}) \times (R \setminus \{0, 1\})$
6.3 Explicit Closed-Form Bounds for the Cases $(β_{A}, β_{H}, α_{A}, α_{H}, λ) \in (P_{SP} \setminus P_{SP, 1}) \times ]0, 1[$
6.4 Explicit Closed-Form Bounds for the Cases $(β_{A}, β_{H}, α_{A}, α_{H}, λ) \in (P_{SP} \setminus P_{SP, 1}) \times (R \setminus [0, 1])$
6.5 Totally Explicit Closed-Form Bounds
6.6 Closed-Form Bounds for Power Divergences of Non-Kullback-Leibler-Information-Divergence Type
6.7 Applications to Decision Making
7 Hellinger Integrals and Power Divergences of Galton-Watson Type Diffusion Approximations
7.1 Branching-Type Diffusion Approximations
7.2 Bounds of Hellinger Integrals for Diffusion Approximations
7.3 Bounds of Power Divergences for Diffusion Approximations
7.4 Applications to Decision Making
A Proofs and Auxiliary Lemmas
A.1 Proofs and Auxiliary Lemmas for Section 3
A.2 Proofs and Auxiliary Lemmas for Section 5
A.3 Proofs and Auxiliary Lemmas for Section 6
A.4 Proofs and Auxiliary Lemmas for Section 7
References

1. Introduction

(This paper is a thoroughly revised, extended and retitled version of the preprint arXiv:1005.3758v1 of both authors.) Over the past twenty years, density-based divergences $D (P, Q)$ between probability distributions P and Q (also known as (dis)similarity measures, directed distances, disparities, distinguishability measures, proximity measures) have turned out to be of substantial importance for decisive statistical tasks such as parameter estimation, testing for goodness-of-fit, Bayesian decision procedures, change-point detection, clustering, as well as for other research fields such as information theory, artificial intelligence, machine learning, signal processing (including image and speech processing), pattern recognition, econometrics, and statistical physics. For some comprehensive overviews on the divergence approach to statistics and probability, the reader is referred to the insightful books of e.g., Liese & Vajda [1], Read & Cressie [2], Vajda [3], Csiszár & Shields [4], Stummer [5], Pardo [6], Liese & Miescke [7], Basu et al. [8], Voinov et al. [9], the survey articles of e.g., Liese & Vajda [10], Vajda & van der Meulen [11], the structure-building papers of Stummer & Vajda [12], Kißlinger & Stummer [13] and Broniatowski & Stummer [14], and the references therein. Divergence-based bounds of minimal mean decision risks (e.g., Bayes risks in finance) can be found e.g., in Stummer & Vajda [15] and Stummer & Lao [16].

Amongst the above-mentioned dissimilarity measures, an important omnipresent subclass are the so-called $f$-divergences of Csiszár [17], Ali & Silvey [18] and Morimoto [19]; important special cases thereof are the total variation distance and the very frequently used $λ$-order power divergences $I_{λ} (P, Q)$ (also known as alpha-entropies, Cressie-Read measures, Tsallis cross-entropies) with $λ \in R$. The latter cover e.g., the very prominent Kullback-Leibler information divergence $I_{1} (P, Q)$ (also called relative entropy), the (squared) Hellinger distance $I_{1 / 2} (P, Q)$, as well as the Pearson chi-square divergence $I_{2} (P, Q)$. It is well known that the power divergences can be built with the help of the $λ$-order Hellinger integrals $H_{λ} (P, Q)$ (where e.g., the case $λ = 1 / 2$ corresponds to the well-known Bhattacharyya coefficient), which are information measures of interest in their own right and which are also the crucial ingredients of $λ$-order Renyi divergences $R_{λ} (P, Q)$ (see e.g., Liese & Vajda [1], van Erven & Harremoes [20]); the case $R_{1 / 2} (P, Q)$ corresponds to the well-known Bhattacharyya distance.
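To make the relation between these quantities concrete, the following minimal sketch evaluates the Hellinger integral between two single Poisson distributions (not between the path-space laws studied later), once by a truncated sum and once by the standard closed form $\exp (a^{λ} b^{1 - λ} - λ a - (1 - λ) b)$ for means a and b. The scalings $I_{λ} = (1 - H_{λ}) / (λ (1 - λ))$ and $R_{λ} = \log H_{λ} / (λ (λ - 1))$ are assumptions here (one common convention); the paper's own definitions are given in its Section 2.4. All function names are illustrative.

```python
import math


def poisson_log_pmf(k, mean):
    """Logarithm of the Poisson(mean) probability of the value k (mean > 0)."""
    return k * math.log(mean) - mean - math.lgamma(k + 1)


def hellinger_integral_numeric(lam, mean_p, mean_q, k_max=400):
    """Truncated sum of p(k)^lam * q(k)^(1 - lam), evaluated in the log domain."""
    return sum(
        math.exp(lam * poisson_log_pmf(k, mean_p)
                 + (1.0 - lam) * poisson_log_pmf(k, mean_q))
        for k in range(k_max + 1)
    )


def hellinger_integral_poisson(lam, mean_p, mean_q):
    """Closed form of the Hellinger integral of two Poisson distributions."""
    return math.exp(mean_p ** lam * mean_q ** (1.0 - lam)
                    - lam * mean_p - (1.0 - lam) * mean_q)


def power_divergence(lam, hellinger):
    """Assumed scaling: I_lam = (1 - H_lam) / (lam * (1 - lam)), lam not in {0, 1}."""
    return (1.0 - hellinger) / (lam * (1.0 - lam))


def renyi_divergence(lam, hellinger):
    """Assumed scaling: R_lam = log(H_lam) / (lam * (lam - 1)), lam not in {0, 1}."""
    return math.log(hellinger) / (lam * (lam - 1.0))


if __name__ == "__main__":
    h_half = hellinger_integral_poisson(0.5, 1.2, 0.8)  # Bhattacharyya coefficient
    print(h_half, power_divergence(0.5, h_half), renyi_divergence(0.5, h_half))
```

For $λ \in ]0, 1[$ the Hellinger integral is at most 1, so both derived quantities are nonnegative under the assumed scalings; for λ outside [0, 1] it is at least 1.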

The above-mentioned information/dissimilarity measures have also been investigated in non-static, time-dynamic frameworks, i.e., for various contexts of stochastic processes such as processes with independent increments (see e.g., Newman [21], Liese [22], Memin & Shiryaev [23], Jacod & Shiryaev [24], Liese & Vajda [1], Linkov & Shevlyakov [25]), Poisson point processes (see e.g., Liese [26], Jacod & Shiryaev [24], Liese & Vajda [1]), diffusion processes and solutions of stochastic differential equations with continuous paths (see e.g., Kabanov et al. [27], Liese [28], Jacod & Shiryaev [24], Liese & Vajda [1], Vajda [29], Stummer [30,31,32], Stummer & Vajda [15]), and generalized binomial processes (see e.g., Stummer & Lao [16]); further related literature can be found e.g., in references of the aforementioned papers and books.

Another important class of time-dynamic models is given by discrete-time integer-valued branching processes, in particular (Bienaymé-)Galton-Watson processes without immigration (GW) respectively with immigration (resp. importation, invasion) (GWI), which have numerous applications in biotechnology, population genetics, internet traffic research, clinical trials, asset price modelling, derivative pricing, and many others. As far as terminology is concerned, we subsume both models under the abbreviation GW(I), and write simply GWI where GW appears as a special parameter case of GWI; recall that a GW(I) is called subcritical, critical or supercritical if its offspring mean is less than 1, equal to 1 or larger than 1, respectively.

For applications of GW(I) in epidemiology, see e.g., the works of Bartoszynski [33], Ludwig [34], Becker [35,36], Metz [37], Heyde [38], von Bahr & Martin-Löf [39], Ball [40], Jacob [41], Barbour & Reinert [42], Section 1.2 of Britton & Pardoux [43]; for more details see Section 2.3 below.

For connections of GW(I) to time series of counts including GLM models, see e.g., Dion, Gauthier & Latour [44], Grunwald et al. [45], Kedem & Fokianos [46], Held, Höhle & Hofmann [47], and Weiß [48]; a more comprehensive discussion can be found in Section 2.2 below.

As far as the combined study of information measures and GW processes is concerned, let us first mention that (transforms of) power divergences have been used for supercritical Galton-Watson processes without immigration for instance as follows: Feigin & Passy [49] study the problem to find an offspring distribution which is closest (in terms of relative entropy type distance) to the original offspring distribution and under which ultimate extinction is certain. Furthermore, Mordecki [50] gives an equivalent characterization for the stable convergence of the corresponding log-likelihood process to a mixed Gaussian limit, in terms of conditions on Hellinger integrals of the involved offspring laws. Moreover, Sriram & Vidyashankar [51] study the properties of offspring-distribution-parameters which minimize the squared Hellinger distance between the model offspring distribution and the corresponding non-parametric maximum likelihood estimator of Guttorp [52]. For the setup of GWI with Poisson offspring and nonstochastic immigration of constant value 1, Linkov & Lunyova [53] investigate the asymptotics of Hellinger integrals in order to deduce large deviation assertions in hypotheses testing problems.

In contrast to the above-mentioned contexts, this paper pursues the following main goals:

(MG1): for any time horizon and any criticality scenario (allowing for non-stationarities), to compute lower and upper bounds–and sometimes even exact values–of the Hellinger integrals $H_{λ} (P_{A} ∥ P_{H})$ , power divergences $I_{λ} (P_{A} ∥ P_{H})$ and Renyi divergences $R_{λ} (P_{A} ∥ P_{H})$ of two alternative Galton-Watson branching processes $P_{A}$ and $P_{H}$ (on path/scenario space), where (i) $P_{A}$ has Poisson( $β_{A}$ ) distributed offspring as well as Poisson( $α_{A}$ ) distributed immigration, and (ii) $P_{H}$ has Poisson( $β_{H}$ ) distributed offspring as well as Poisson( $α_{H}$ ) distributed immigration; the non-immigration cases are covered as $α_{A} = α_{H} = 0$ ; as a side effect, we also aim for corresponding asymptotic distinguishability results;
(MG2): to compute the corresponding limit quantities for the context in which (a proper rescaling of) the two alternative Galton-Watson processes with immigration converge to Feller-type branching diffusion processes, as the time-lags between the generation-size observations tend to zero;
(MG3): as an exemplary field of application, to indicate how to use the results of (MG1) for Bayesian decision making in the epidemiological context of an infectious-disease pandemic (e.g., the current COVID-19), where e.g., potential state-budgetary losses can be controlled by alternative public policies (such as different degrees of lockdown) for mitigating the time-evolution of the number of infectious persons (being quantified by a GW(I)). Corresponding Neyman-Pearson testing will be treated, too.

Because of the involved Poisson distributions, these goals can be tackled with a high degree of tractability, which is worked out in detail with the following structure (see also the full table of contents above): in Section 2, we first introduce (i) the basic ingredients of Galton-Watson processes together with their interpretations in the above-mentioned pandemic setup where it is essential to study all types of criticality (being connected with levels of reproduction numbers), (ii) the employed fundamental information measures such as Hellinger integrals, power divergences and Renyi divergences, (iii) the underlying decision-making framework, as well as (iv) connections to time series of counts and asymptotical distinguishability. Thereafter, we start our detailed technical analyses by giving recursive exact values, respectively recursive bounds (as well as their applications), of Hellinger integrals $H_{λ} (P_{A} ∥ P_{H})$ (see Section 3), power divergences $I_{λ} (P_{A} ∥ P_{H})$ and Renyi divergences $R_{λ} (P_{A} ∥ P_{H})$ (see Section 4 and Section 5). Explicit closed-form bounds of Hellinger integrals $H_{λ} (P_{A} ∥ P_{H})$ will be worked out in Section 6, whereas Section 7 deals with Hellinger integrals and power divergences of the above-mentioned Galton-Watson type diffusion approximations.

2. The Framework and Application Setups

2.1. Process Setup

We investigate dissimilarity measures and apply them to decisions, in the following context. Let the integer-valued random variable $X_{n}$ ($n \in N_{0}$) denote the size of the nth generation of a population (of persons, organisms, spreading news, other kinds of objects, etc.) with specified characteristics, and suppose that for the modelling of the time-evolution $n \mapsto X_{n}$ we have the choice between the following two (e.g., alternative, competing) models (H) and (A):

(H) a discrete-time homogeneous Galton-Watson process with immigration GWI, given by the recursive description

$$X_{0} \in N; \quad N_{0} \ni X_{n} = \sum_{k = 1}^{X_{n - 1}} Y_{n - 1, k} + \tilde{Y}_{n}, \quad n \in N, \qquad (1)$$

where $Y_{n - 1, k}$ is the number of offspring of the kth object (e.g., organism, person) within the $(n - 1)$th generation, and $\tilde{Y}_{n}$ denotes the number of immigrating objects in the nth generation. Notice that we employ an arbitrary deterministic (i.e., degenerate random) initial generation size $X_{0}$. We always assume that under the corresponding dynamics-governing law $P_{H}$

(GWI1): the collection $Y : = \{Y_{n - 1, k}, n \in N, k \in N\}$ consists of independent and identically distributed (i.i.d.) random variables which are Poisson distributed with parameter $β_{H} > 0$ ,
(GWI2): the collection $\tilde{Y} : = \{{\tilde{Y}}_{n}, n \in N\}$ consists of i.i.d. random variables which are Poisson distributed with parameter $α_{H} \geq 0$ (where $α_{H} = 0$ stands for the degenerate case of having no immigration),
(GWI3): Y and $\tilde{Y}$ are independent.

(A) a discrete-time homogeneous Galton-Watson process with immigration GWI given by the same recursive description (1), but with different dynamics-governing law $P_{A}$ under which (GWI1) holds with parameter $β_{A} > 0$ (instead of $β_{H} > 0$), (GWI2) holds with $α_{A} \geq 0$ (instead of $α_{H} \geq 0$), and (GWI3) holds. As a side remark, in some contexts the two models (H) and (A) may function as a “sandwich” of a more complicated not fully known model.

Basic and advanced facts on general GWI (introduced by Heathcote [54]) can be found e.g., in the monographs of Athreya & Ney [55], Jagers [56], Asmussen & Hering [57], Haccou [58]; see also e.g., Heyde & Seneta [59], Basawa & Rao [60], Basawa & Scott [61], Sankaranarayanan [62], Wei & Winnicki [63], Winnicki [64], Guttorp [52] as well as Yanev [65] (and also the references in all of those) for adjacent fundamental statistical issues including the involved technical and conceptual challenges.

For the sake of brevity, wherever we introduce or discuss corresponding quantities simultaneously for both models $H$ and $A$, we will use the subscript • as a synonym for either the symbol $H$ or $A$. For illustration, recall the well-known fact that the corresponding conditional probabilities $P_{•} (X_{n} = \cdot | X_{n - 1} = k)$ are again Poisson-distributed, with parameter $β_{•} \cdot k + α_{•}$.
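The conditional Poisson property gives a direct way to generate sample paths of either model without drawing the individual offspring counts: the next generation size is a single Poisson draw with mean $β_{•} \cdot X_{n - 1} + α_{•}$. The following minimal sketch illustrates this with the standard library only; the function names, the default seed and the chunk size of the elementary Poisson sampler (Knuth's multiplication method) are illustrative choices, not part of the paper.

```python
import math
import random


def sample_poisson(mean, rng):
    """Draw one Poisson(mean) variate with Knuth's multiplication method.

    Large means are split into chunks, using that independent Poisson
    variables add up to a Poisson variable with the summed mean.
    """
    total = 0
    remaining = mean
    while remaining > 0.0:
        chunk = min(remaining, 30.0)
        remaining -= chunk
        threshold = math.exp(-chunk)
        product = rng.random()
        while product > threshold:
            total += 1
            product *= rng.random()
    return total


def simulate_gwi(beta, alpha, x0, n_steps, seed=0):
    """Simulate X_0, ..., X_n for Poisson(beta) offspring and Poisson(alpha) immigration."""
    rng = random.Random(seed)
    path = [x0]
    for _ in range(n_steps):
        # Given X_{n-1} = k, the next generation size is Poisson(beta * k + alpha).
        path.append(sample_poisson(beta * path[-1] + alpha, rng))
    return path


if __name__ == "__main__":
    print(simulate_gwi(beta=0.8, alpha=1.5, x0=5, n_steps=20))  # subcritical, with immigration
    print(simulate_gwi(beta=1.3, alpha=0.0, x0=5, n_steps=20))  # supercritical, no immigration
```

The sampler's running time grows linearly with the mean, so this sketch is only suitable for short horizons in the supercritical case.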

In order to achieve a transparently representable structure of our results, we subsume the involved parameters as follows:

(PS1): $P_{SP}$ is the set of all constellations $(β_{A}, β_{H}, α_{A}, α_{H})$ of real-valued parameters $β_{A} > 0$ , $β_{H} > 0$ , $α_{A} > 0$ , $α_{H} > 0$ , such that $β_{A} \neq β_{H}$ or $α_{A} \neq α_{H}$ (or both); in other words, both models are non-identical and have non-vanishing immigration;
(PS2): $P_{NI}$ is the set of all $(β_{A}, β_{H}, α_{A}, α_{H})$ of real-valued parameters $β_{A} > 0$ , $β_{H} > 0$ , $α_{A} = α_{H} = 0$ , such that $β_{A} \neq β_{H}$ ; this corresponds to the important special case that both models have no immigration and are non-identical;
(PS3): the resulting disjoint union will be denoted by $P = P_{SP} \cup P_{NI}$ .

Notice that for (unbridgeable) technical reasons, we do not allow for “crossovers” between “immigration and no-immigration” (i.e., $α_{A} = 0$ and $α_{H} \neq 0$, respectively, $α_{A} \neq 0$ and $α_{H} = 0$). In practice, this is not a strong restriction, since one may take e.g., $α_{A} = 10^{- 12}$ and $α_{H} = 1$.
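The case distinction (PS1) to (PS3), together with the excluded crossovers, can be summarized as a small classification routine. This is only an illustrative sketch: the function name and the returned labels ("P_SP", "P_NI", "identical", "crossover") are hypothetical conventions, not notation used in the paper.

```python
def classify_constellation(beta_a, beta_h, alpha_a, alpha_h):
    """Assign (beta_A, beta_H, alpha_A, alpha_H) to P_SP or P_NI, if possible.

    Returns "P_SP" or "P_NI" for admissible constellations, "identical" if
    both models coincide, and "crossover" if exactly one model has immigration.
    """
    if beta_a <= 0 or beta_h <= 0:
        raise ValueError("offspring parameters must be strictly positive")
    if alpha_a < 0 or alpha_h < 0:
        raise ValueError("immigration parameters must be nonnegative")

    if alpha_a > 0 and alpha_h > 0:
        # Both models have non-vanishing immigration.
        if beta_a != beta_h or alpha_a != alpha_h:
            return "P_SP"
        return "identical"
    if alpha_a == 0 and alpha_h == 0:
        # Neither model has immigration.
        return "P_NI" if beta_a != beta_h else "identical"
    # Exactly one immigration parameter vanishes: excluded for technical reasons.
    return "crossover"


if __name__ == "__main__":
    print(classify_constellation(1.2, 0.9, 0.5, 0.5))    # both with immigration
    print(classify_constellation(1.2, 0.9, 0.0, 0.0))    # both without immigration
    print(classify_constellation(1.2, 0.9, 1e-12, 1.0))  # tiny immigration avoids the crossover
```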

For the non-immigration case $α_{•} = 0$ one has the following extinction properties (see e.g., Harris [66], Athreya & Ney [55]). As usual, let us define the extinction time $τ := \min \{i \in N : X_{ℓ} = 0 \text{ for all integers } ℓ \geq i\}$ if this minimum exists, and $τ := \infty$ else. Correspondingly, let $B := \{τ < \infty\}$ be the extinction set. If the offspring mean $β_{•}$ satisfies $β_{•} < 1$ (which is called the subcritical case) or $β_{•} = 1$ (which is known as the critical case), then extinction is certain, i.e., there holds $P (B | X_{0} = 1) = 1$. However, if the offspring mean satisfies $β_{•} > 1$ (which is called the supercritical case), then there is a probability greater than zero that the population never dies out, i.e., $P (B | X_{0} = 1) \in ]0, 1[$. In the latter case, on the event of non-extinction, $X_{n}$ explodes (a.s.) to infinity as $n \to \infty$.
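For Poisson offspring, the extinction probability in the non-immigration case can be computed numerically. The sketch below relies on the classical branching-process fact (not derived in this paper) that $P (B | X_{0} = 1)$ is the smallest root q in [0, 1] of the fixed-point equation $q = \exp (β_{•} (q - 1))$, i.e., of the Poisson probability generating function; the function name, tolerance and iteration cap are illustrative assumptions.

```python
import math


def extinction_probability(beta, x0=1, tol=1e-14, max_iter=1_000_000):
    """Extinction probability of a GW process with Poisson(beta) offspring.

    Without immigration and starting from x0 ancestors, the x0 family
    trees are independent, so the result is q ** x0, where q is the
    smallest fixed point of q = exp(beta * (q - 1)) in [0, 1].
    """
    if beta <= 0:
        raise ValueError("beta must be strictly positive")
    if beta <= 1.0:
        # Subcritical and critical cases: extinction is certain.
        return 1.0
    q = 0.0
    for _ in range(max_iter):
        # Iterating the generating function from 0 increases monotonically
        # to the smallest fixed point.
        q_next = math.exp(beta * (q - 1.0))
        if abs(q_next - q) < tol:
            q = q_next
            break
        q = q_next
    return q ** x0


if __name__ == "__main__":
    for beta in (0.7, 1.0, 1.5, 2.0):
        print(beta, extinction_probability(beta))
```

The iteration converges slowly when the offspring mean is only slightly above 1, which is why the iteration cap is set generously.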

In contrast, for the (nondegenerate, nonvanishing) immigration case $α_{•} \neq 0$ there is no extinction, viz. $P (B | X_{0} = 1) = 0$, although there may be zero population $X_{ℓ_{0}} = 0$ for some intermediate time $ℓ_{0} \in N$; but due to the immigration, with probability one there is always a later time $ℓ_{1} > ℓ_{0}$ such that $X_{ℓ_{1}} > 0$. Nevertheless, also for the setup $α_{•} \neq 0$ it is important to know whether $β_{•} ⪌ 1$ (which is still called (super-, sub-)criticality), since e.g., in the case $β_{•} < 1$ the population size $X_{n}$ converges (as $n \to \infty$) to a stationary distribution on $N$ whereas for $β_{•} > 1$ the behaviour is non-stationary (non-ergodic), see e.g., Athreya & Ney [55].

At this point, let us emphasize that in our investigations (both for $α_{•} = 0$ and for $α_{•} \neq 0$) we do allow for “crossovers” between “different criticalities”, i.e., we deal with all cases $β_{A} ⪌ 1$ versus all cases $β_{H} ⪌ 1$; as will be explained in the following, this unifying flexibility is especially important for corresponding epidemiological-model comparisons (e.g., for the sake of decision making).

One of our main goals is to quantitatively compare (the time-evolution of) two competing GWI models $H$ and $A$ with respective parameter sets $(β_{H}, α_{H})$ and $(β_{A}, α_{A})$, in terms of the information measures $H_{λ} (P_{A} ∥ P_{H})$ (Hellinger integrals), $I_{λ} (P_{A} ∥ P_{H})$ (power divergences), $R_{λ} (P_{A} ∥ P_{H})$ (Renyi divergences). The latter two express a distance (degree of dissimilarity) between $H$ and $A$. From this, we shall particularly derive applications for decision making under uncertainty (including tests).

2.2. Connections to Time Series of Counts

It is well known that a Galton-Watson process with Poisson offspring (with parameter $β_{•}$) and Poisson immigration (with parameter $α_{•}$) is “distributionally” equal to each of the following models (listed in “tree-type” chronological order):

(M1) a Poissonian Generalized Integer-valued Autoregressive process GINAR(1) in the sense of Gauthier & Latour [67] (see also Dion, Gauthier & Latour [44], Latour [68], as well as Grunwald et al. [45]), that is, a first-order autoregressive time series with Poissonian thinning (with parameter $β_{•}$) and Poissonian innovations (with parameter $α_{•}$);

(M2) a Poissonian first order Conditional Linear Autoregressive model (Poissonian CLAR(1)) in the sense of Grunwald et al. [45] (and earlier preprints thereof) (since the conditional expectation is $E_{P_{•}} [X_{n} | F_{n - 1}] = α_{•} + β_{•} \cdot X_{n - 1}$); this can be equally seen as a Poissonian autoregressive Generalized Linear Model GLM with identity link function (cf. [45] as well as Chapter 4 of Kedem & Fokianos [46]), that is, an autoregressive GLM with Poisson distribution as random component and the identity link as systematic component;

the same model was used (and generalized)

(M2i): under the name BIN(1) by Rydberg & Shephard [69] for the description of the number $X_{n}$ of stock transactions/trades recorded up to time n;
(M2ii): under the name Poisson autoregressive model PAR(1) by Brandt & Williams [70] for the description of event counts in political and other social science applications;
(M2iii): under the name Autoregressive Conditional Poisson model ACP(1,0) by Heinen [71];
(M2iv): by Held, Höhle & Hofmann [47] as well as Held et al. [72], as a description of the time-evolution of counts from infectious disease surveillance databases, where $β_{•}$ (respectively, $α_{•}$ ) is interpreted as driving parameter of epidemic (respectively, endemic) component; in principle, this type of modelling can be also implicitly recovered as a special case of the epidemics-treating work of Finkenstädt, Bjornstad & Grenfell [73], by assuming trend- and season-neglecting (e.g., intra-year) measles data in urban areas of about 10 million people (provided that their population size approximation extends linearly);
(M2v): under the name integer-valued Generalized Autoregressive Conditional Heteroscedastic model INGARCH(1,0) by Ferland, Latour & Oraichi [74] (since the conditional variance is $\mathrm{Var}_{P_{•}} [X_{n} | F_{n - 1}] = α_{•} + β_{•} \cdot X_{n - 1}$), see also Weiß [75]; this was later given the more specific name INARCH(1) model by Weiß [76,77], and frequently applied thereafter; for an “overlapping-generation type” interpretation of the INARCH(1) model, which is an adequate description for the time-evolution of overdispersed counts with an autoregressive serial dependence structure, see Weiß & Testik [78]; for a corresponding comprehensive recent survey (also to more general count time series), the reader is referred to the book of Weiß [48].

Moreover, according to the general considerations of Grunwald et al. [45], the Poissonian Galton-Watson model with immigration may possibly be “distributionally equal” to an integer-valued autoregressive model with random coefficient (thinning).

Nowadays, besides the name homogeneous Galton-Watson model with immigration GWI, the name INARCH(1) seems to be the most used one, and we follow this terminology (with emphasis on GWI). Typical features of the above-mentioned models (M1) to (M2v) are the use of $Z$ as the set of times, and the assumptions $α_{•} > 0$ as well as $β_{•} \in ]0, 1[$, which guarantee stationarity and ergodicity (see above). In contrast, we employ $N_{0}$ as the set of times, a degenerate (and thus, non-equilibrium) starting distribution, and arbitrary $α_{•} \geq 0$ as well as $β_{•} > 0$. For such a situation, as explained above, we quantitatively compare two competing GWI models $H$ and $A$ with respective parameter sets $(β_{H}, α_{H})$ and $(β_{A}, α_{A})$. Since (as can be seen e.g., in (29) below) we basically employ only (conditionally) distributional ingredients, such as the corresponding likelihood ratio (see e.g., (13) to (15), (27) to (29) below), all the results of Sections 3 to 6 can be immediately carried over to the above-mentioned time-series contexts (where we even allow for non-stationarities; in fact, we start with a one-point/Dirac distribution); for the sake of brevity, in the rest of the paper this will not be mentioned explicitly anymore.
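To illustrate what "only (conditionally) distributional ingredients" means in practice, the following sketch evaluates the log-likelihood ratio of an observed path under the two models, using nothing but the conditional Poisson law with mean $β_{•} \cdot X_{n - 1} + α_{•}$ and the common deterministic start $X_{0}$. It is a hedged illustration, not a transcription of the paper's formulas (13) to (15) or (27) to (29); the function name and error handling are assumptions.

```python
import math


def log_likelihood_ratio(path, beta_a, alpha_a, beta_h, alpha_h):
    """Log of P_A(path) / P_H(path) for an observed path x_0, x_1, ..., x_n.

    Each transition contributes the log-ratio of two Poisson probabilities
    with means beta * x_{n-1} + alpha; the factorial terms cancel.
    """
    total = 0.0
    for previous, current in zip(path, path[1:]):
        mean_a = beta_a * previous + alpha_a
        mean_h = beta_h * previous + alpha_h
        if mean_a == 0.0 and mean_h == 0.0:
            # No immigration and an empty previous generation: the next
            # generation is empty under both laws and contributes nothing.
            if current != 0:
                raise ValueError("path is impossible under both models")
            continue
        if mean_a == 0.0 or mean_h == 0.0:
            raise ValueError("immigration/no-immigration crossover is not supported")
        total += current * math.log(mean_a / mean_h) - (mean_a - mean_h)
    return total


if __name__ == "__main__":
    observed = [5, 7, 6, 9, 12, 10]
    print(log_likelihood_ratio(observed, beta_a=1.2, alpha_a=1.0, beta_h=0.8, alpha_h=1.0))
```

Positive values indicate that the observed path is more plausible under (A) than under (H), and negative values the opposite.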

Notice that a Poissonian GWI as well as all models (M1) and (M2) are (despite their conditional Poisson law) typically overdispersed since

$$E_{P_{•}} [X_{n}] = α_{•} + β_{•} \cdot E_{P_{•}} [X_{n - 1}] \leq α_{•} + β_{•} \cdot E_{P_{•}} [X_{n - 1}] + β_{•}^{2} \cdot \mathrm{Var}_{P_{•}} [X_{n - 1}] = \mathrm{Var}_{P_{•}} [X_{n}], \quad n \in N \setminus \{1\},$$

with equality iff (i.e., if and only if) $α_{•} = 0$ (NI) and $X_{n - 2} = 0$ (extinction at $n - 2$ with $n \geq 3$).
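The two recursions behind this inequality, $E [X_{n}] = α_{•} + β_{•} \cdot E [X_{n - 1}]$ and $\mathrm{Var} [X_{n}] = α_{•} + β_{•} \cdot E [X_{n - 1}] + β_{•}^{2} \cdot \mathrm{Var} [X_{n - 1}]$, can be iterated directly from the deterministic start $X_{0}$ (mean $X_{0}$, variance 0). The following sketch does so; the function name is illustrative, and the limiting mean $α_{•} / (1 - β_{•})$ for $β_{•} < 1$ quoted in the comment is a standard consequence of the first recursion rather than a statement taken from the text.

```python
def gwi_moments(beta, alpha, x0, n_steps):
    """Means and variances of X_0, ..., X_n for a Poissonian GWI started at x0."""
    means = [float(x0)]
    variances = [0.0]  # X_0 is deterministic
    for _ in range(n_steps):
        conditional_mean = alpha + beta * means[-1]
        # Law of total variance: E[Var(X_n | X_{n-1})] + Var(E[X_n | X_{n-1}]).
        variances.append(conditional_mean + beta ** 2 * variances[-1])
        means.append(conditional_mean)
    return means, variances


if __name__ == "__main__":
    means, variances = gwi_moments(beta=0.5, alpha=1.0, x0=1, n_steps=10)
    for n, (m, v) in enumerate(zip(means, variances)):
        # For beta < 1 the mean approaches alpha / (1 - beta), here 2.
        print(n, round(m, 4), round(v, 4))
```

In this example the first generation is equidispersed (mean equals variance), whereas from the second generation onwards the variance exceeds the mean.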

2.3. Applicability to Epidemiology

The above-mentioned framework can be used for any of the numerous fields of applications of discrete-time branching processes, and of the closely related INARCH(1) models. For the sake of brevity, we explain this—as a kind of running-example—in detail for the currently highly important context of the epidemiology of infectious diseases. For insightful non-mathematical introductions to the latter, see e.g., Kaslow & Evans [79], Osterholm & Hedberg [80]; for a first entry as well as overviews on modelling, the reader is referred to e.g., Grassly & Fraser [81], Keeling & Rohani [82], Yan [83,84], Britton [85], Diekmann, Heesterbeek & Britton [86], Cummings & Lessler [87], Just et al. [88], Britton & Giardina [89], Britton & Pardoux [43]. A survey on the particular role of branching processes in epidemiology can be found e.g., in Jacob [41].

Undoubtedly, by nature, the spreading of an infectious disease through a (human, animal, plant) population is a branching process with possible immigration. Indeed, typically one has the following mechanism:

(D1): at some time $t_{k}^{E}$ –called the time of exposure (moment of infection)—an individual k of a specified population is infected in a wide sense, i.e., entered/invaded/colonized by a number of transmissible disease-causative pathogens (etiologic agents such as viruses, bacteria, protozoans and other parasites, subviruses (e.g., prions and plant viroids), etc.); the individual is then a host (of pathogens);
(D2): depending on the level of immunity and some other factors, these pathogens may multiply/replicate within the host to an extent (over a threshold number) such that at time $t_{k}^{I}$ some of the pathogens start to leave their host (shedding of pathogens); in other words, the individual k becomes infectious at the time $t_{k}^{I}$ of onset of infectiousness. Ex post, one can then say that the individual became infected in the narrow sense at earlier time $t_{k}^{E}$ and call it a primary case. The time interval $[t_{k}^{E}, t_{k}^{I} [$ is called the latent/latency/pre-infectious period of k, and $t_{k}^{I} - t_{k}^{E}$ its duration (in some literature, there is no verbal distinction between them); notice that $t_{k}^{I}$ may differ from the time $t_{k}^{O S}$ of onset (first appearance) of symptoms, which leads to the so-called incubation period $[t_{k}^{E}, t_{k}^{O S} [$ ; if $t_{k}^{I} < t_{k}^{O S}$ then $[t_{k}^{I}, t_{k}^{O S} [$ is called the pre-symptomatic period;
(D3): as long as the individual k stays infectious, by shedding of pathogens it may infect in a narrow sense a random number $Y_{k} \in N_{0}$ of other individuals which are susceptible (i.e., neither immune nor already infected in a narrow sense), where the distribution of $Y_{k}$ depends on the individual’s (natural, voluntary, forced) behaviour, its environment, as well as some other factors e.g., connected with the type of pathogen transmission; the newly infected individuals are called offspring of k, and secondary cases if they are from the same specified population or exportations if they are from a different population; from the view of the latter, these infections are imported cases and thus can be viewed as immigrants;
(D4): at the time $t_{k}^{R}$ of cessation of infectiousness, the individual stops being infectious (e.g., because of recovery, death, or total isolation); the time interval $[t_{k}^{I}, t_{k}^{R} [$ is called the period of infectiousness (also period of communicability, infectious/infective/shedding/contagious period) of k, and $t_{k}^{R} - t_{k}^{I}$ its duration (in some literature, there is no verbal distinction between them); notice that $t_{k}^{R}$ may differ from the time $t_{k}^{C S}$ of cessation (last appearance) of symptoms which leads to the so-called sickness period $[t_{k}^{O S}, t_{k}^{C S} [$ ;
(D5): this branching mechanism continues within the specified population until there are no infectious individuals and also no importations anymore (eradication, full extinction, total elimination)– up to a specified final time (which may be large or even infinite);

All the above-mentioned times $t_{k}^{\cdot}$ and time intervals are random, by nature. Two further connected quantities are also important for modelling (see e.g., Yan & Chowell [84] (p. 241ff), including a history of corresponding terminology). Firstly, the generation interval (generation time, transmission interval) is the time interval from the onset of infectiousness in a primary case (called the infector) to the onset of infectiousness in a secondary case (called the infectee) infected by the primary case; clearly, the generation interval is random, and so is its duration (often, the (population-)mean of the latter is also called generation interval). Typically, generation intervals are important ingredients of branching process models of infectious diseases. Secondly, the serial interval describes the time interval from the onset of symptoms in a primary case to the onset of symptoms in a secondary case infected by the primary case. By nature, the serial interval is random, and so is its duration (often, the (population-)mean of the latter is also called serial interval). Typically, the serial interval is easier to observe than the generation interval, and thus, the latter is often approximately estimated from data of the former. For further investigations on generation and serial intervals, the reader is referred to e.g., Fine [90], Svensson [91,92], Wallinga & Lipsitch [93], Forsberg White & Pagano [94], Nishiura [95], Scalia Tomba et al. [96], Trichereau et al. [97], Vink, Bootsma & Wallinga [98], Champredon & Dushoff [99], Just et al. [88], and, especially for the novel COVID-19 pandemic, An der Heiden & Hamouda [100], Ferretti et al. [101], Ganyani et al. [102], Li et al. [103], Nishiura, Linton & Akhmetzhanov [104], Park et al. [105].

With the help of the above-mentioned individual ingredients, one can build, by aggregation, numerous different population-wide models of infectious diseases in discrete time as well as in continuous time; the latter are typically observed only in discrete-time steps (discrete-time sampling), and hence in the following we concentrate on discrete-time modelling (of the real or the observational process). In fact, we confine ourselves to the important task of modelling the evolution $n \mapsto X_{n}$ of the number of incidences at “stage” n, where incidence refers to the number of new infected/infectious individuals. Here, n may be a generation number where, inductively, $n = 0$ refers to the generation of the first appearing primary cases in the population (also called initial importations), and n refers to the generation of offspring of all individuals of generation $n - 1$. Alternatively, n may be the index of a physical (“calendar”) point of time $t_{n}$, which may be deterministic or random; e.g., ${(t_{n})}_{n \in N}$ may be a strictly increasing sequence of (i) equidistant deterministic time points (and thus, one can identify $t_{n} = n$ in appropriate time units such as days, weeks, bi-weeks, months), or (ii) non-equidistant deterministic time points, or (iii) random time points (as a side remark, let us mention that in some situations, $X_{n}$ may alternatively denote the number of prevalences at “stage” n, where prevalence refers to the total number of infected/infectious individuals (e.g., through some methodical tricks like “self-infection”)).

In the light of this, one can loosely define an epidemic as the rapid spread of an infectious disease within a specified population, where the numbers $X_{n}$ of incidences are high (or much higher than expected) for that kind of population. A pandemic is a geographically large-scale (e.g., multicontinental or worldwide) epidemic. An outbreak/onset of an epidemic in the narrow sense is the (time of) change where an infectious disease turns into an epidemic, which is typically quantified by exceedance over a threshold; analogously, an outbreak/onset of a pandemic is the (time of) change where the epidemic turns into a pandemic. Of course, one goal of infectious-disease modelling is to quantify “early enough” the potential danger of an emerging outbreak of an epidemic or a pandemic.

Returning to possible models of the incidence-evolution

n \mapsto X_{n}

, its description may be theoretically derived from more detailed, time-finer, highly sophisticated, individual-based “mechanistic” infectious-disease models such as e.g., continuous-time susceptible-exposed-infectious-recovered (SEIR) models (see the above-mentioned introductory texts); however, as e.g., pointed out in Held et al. [72], the estimation of the correspondingly involved numerous parameters may be too ambitious for routinely collected, non-detailed disease data, such as e.g., daily/weekly counts

X_{n}

of incidences–especially in decisive emerging/early phases of a novel disease (such as the current COVID-19 pandemic). Accordingly, in the following we assume that

X_{n}

can be approximately described by a Poissonian Galton-Watson process with immigration respectively a (“distributionally equal”) Poissonian autoregressive Generalized Linear Model in the sense of (M2). Depending on the situation, this can be quite reasonable, for the following arguments (apart from the usual “if the data say so”). Firstly, it is well known (see e.g., Bartoszynski [33], Ludwig [34], Becker [35,36], Metz [37], Heyde [38], von Bahr & Martin-Löf [39], Ball [40], Jacob [41], Barbour & Reinert [42], Section 1.2 of Britton & Pardoux [43]) that in populations with a relatively high number of susceptible individuals and a relatively low number of infectious individuals (e.g., in a large population and in decisive emerging/early phases of the disease spreading), the incidence-evolution

n \mapsto X_{n}

can be well approximated by a (e.g., Poissonian) Galton-Watson process with possible immigration where n plays the role of a generation number. If the above-mentioned generation interval is “nearly” deterministic (leading to nearly synchronous, non-overlapping generations)—which is the case e.g., for (phases of) Influenza A(H1N1)pdm09, Influenza A(H3N2), Rubella (cf. Vink, Bootsma & Wallinga [98]), and COVID-19 (cf. Ferretti et al. [101])—and the length of the generation interval is approximated by its mean length and the latter is tuned to be equal to the unit time between consecutive observations, then n plays the role of an observation (surveillance) time. This effect is even more realistic if the period of infectiousness is nearly deterministic and relatively short. Secondly, as already mentioned above, the spreading of an infectious disease is intrinsically a (not necessarily Poissonian Galton-Watson) branching mechanism, which may be blurred by other effects in a way that a Poissonian autoregressive Generalized Linear Model is still a reasonably fitting model for the observational process in disease surveillance. The latter have been used e.g., by Finkenstädt, Bjornstad & Grenfell [73], Held, Höhle & Hofmann [47], and Held et al. [72]; they all use non-constant parameters (e.g., to describe seasonal effects, which are however unknown in early phases of a novel infectious disease such as COVID-19). In contrast, we employ different new–namely divergence-based–statistical techniques, for which we assume constant parameters but also indicate procedures for the detection of changes; the extension to non-constant parameters is straightforward.

Returning to Galton-Watson processes, let us mention as a side remark that they can be also used to model the above-mentioned within-host replication dynamics (D2) (e.g., in the time-interval

[t_{k}^{E}, t_{k}^{I} [

and beyond) on a sub-cellular level, see e.g., Spouge [106], as well as Taneyhill, Dunn & Hatcher [107] for parasitic pathogens; on the other hand, one can also employ Galton-Watson processes for quantifying snowball-effect (avalanche-effect, cascade-effect) type, economic-crisis triggered consequences of large epidemics and pandemics, such as e.g., the potential spread of transmissible (i) foreclosures of homes (cf. Parnes [108]), or clearly also (ii) company insolvencies, downsizings and credit-risk downgradings; moreover, the time-evolution of integer-valued indicators concerning the spread of (rational or unwarranted) fears resp. perceived threats may be modelled, too.

Summing up things, we model the evolution

n \mapsto X_{n}

of the number of incidences at stage n by a Poissonian Galton-Watson process with immigration GWI

X_{0} \in N; \quad N_{0} \ni X_{n} = \sum_{k = 1}^{X_{n - 1}} Y_{n - 1, k} + {\tilde{Y}}_{n}, \quad n \in N, \quad \text{cf. (1), (GWI1)-(GWI3), with law } P_{•},

(where

Y_{n - 1, k}

corresponds to the

Y_{k}

of (D3), equipped with an additional stage-index

n - 1

), respectively by a corresponding “distributionally equal”–possibly non-stationary– Poissonian autoregressive Generalized Linear Model in the sense of (M2); depending on the situation, we may also fix a (deterministic or random) upper time horizon other than infinity. Recall that both models are overdispersed, which is consistent with the current debate on overdispersion in connection with the current COVID-19 pandemic. In infectious-disease language, the sum

\sum_{k = 1}^{X_{n - 1}} Y_{n - 1, k}

can also be loosely interpreted as epidemic component (in a narrow sense) driven by the parameter

β_{•}

, and

{\tilde{Y}}_{n}

as endemic component driven by the parameter

α_{•}

. In fact, the offspring mean (here,

β_{•}

) is called reproduction number and plays a major role–also e.g., in the current public debate about the COVID-19 pandemic–because it crucially determines the rapidity of the spread of the disease and—as already indicated above in the second and third paragraph after (PS3)–also the probability that the epidemic/pandemic becomes (maybe temporally) extinct or at least stationary at a low level (that is, endemic). For this to happen,

β_{•}

should be subcritical, i.e.,

β_{•} < 1

, and even better, close to zero. Of course, the size of the importation mean

α_{•} \geq 0

matters, too, in a secondary order.
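As an illustration of the above recursion, the following minimal Python sketch simulates a path of a Poissonian GWI and evaluates the mean recursion $E[X_{n}] = β_{•} E[X_{n-1}] + α_{•}$. It uses the fact that, given $X_{n-1}$, the sum of the independent Poisson offspring and the Poisson immigration is again Poisson-distributed with mean $β_{•} X_{n-1} + α_{•}$. The function names, the initial value $X_{0} = 5$, the fixed seed and the elementary Poisson sampler are assumptions made for illustration only; the two parameter pairs are the exemplary "mild" and "dangerous" values used in the sequential-monitoring example of Section 2.5.

```python
import math
import random


def sample_poisson(mean, rng):
    """Draw one Poisson(mean) variate by Knuth's multiplication method.

    Large means are split into chunks of at most 30, using the fact that a
    sum of independent Poisson variables is again Poisson (avoids underflow).
    """
    total = 0
    remaining = float(mean)
    while remaining > 0:
        chunk = min(remaining, 30.0)
        threshold = math.exp(-chunk)
        count, product = 0, rng.random()
        while product > threshold:
            count += 1
            product *= rng.random()
        total += count
        remaining -= chunk
    return total


def simulate_gwi(x0, beta, alpha, n_steps, rng):
    """Simulate (X_0, ..., X_n): given X_{n-1}, X_n ~ Poisson(beta*X_{n-1} + alpha)."""
    path = [x0]
    for _ in range(n_steps):
        path.append(sample_poisson(beta * path[-1] + alpha, rng))
    return path


def expected_incidence(x0, beta, alpha, n):
    """E[X_n | X_0 = x0] from the recursion E[X_n] = beta * E[X_{n-1}] + alpha."""
    mean = float(x0)
    for _ in range(n):
        mean = beta * mean + alpha
    return mean


rng = random.Random(2020)  # fixed seed (assumption), only for reproducibility
mild = simulate_gwi(x0=5, beta=0.5, alpha=0.2, n_steps=10, rng=rng)
severe = simulate_gwi(x0=5, beta=2.0, alpha=10.0, n_steps=10, rng=rng)
print("subcritical path:  ", mild)
print("supercritical path:", severe)
print("expected X_10:", expected_incidence(5, 0.5, 0.2, 10),
      expected_incidence(5, 2.0, 10.0, 10))
```

In the subcritical case the expected incidence settles near $α_{•} / (1 - β_{•})$, whereas in the supercritical case it grows geometrically; this is the qualitative difference that the divergences of the following subsections quantify.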

Keeping this in mind, let us discuss which factors the reproduction number

β_{•}

and the importation mean

α_{•}

depend upon, and how they can be influenced/controlled. To begin with, by recalling the above-mentioned points (D1) to (D5) and by adapting the considerations of e.g., Grassly & Fraser [81] to our model, one encounters the fact that the distribution of the offspring

Y_{n - 1, k}

—here driven by the reproduction number (offspring mean)

β_{•}

—depends on the following factors:

(B1)

the degree of infectiousness of the individual k, with three major components:

(B1a): degree of biological infectiousness; this reflects the within-host dynamics (D2) of the “representative” individual k, in particular the duration and amount of the corresponding replication and shedding/excretion of the infectious pathogens; this degree depends thus on (i) the number of host-invading pathogens (called the initial infectious dose), (ii) the type of the pathogen with respect to e.g., its principal capabilities of replication speed, range of spread and drug-sensitivity, (iii) features of the immune system of the host k including the level of innate or acquired immunity, and (iv) the interaction between the genetic determinants of disease progression in both the pathogen and the host;
(B1b): degree of behavioural infectiousness; this depends on the contact patterns of an infected/infectious individual (and, if relevant, the contact patterns of intermediate hosts or vectors), in relation to the disease-specific type of route(s) of transmission of the infectious pathogens (for an overview of the latter, see e.g., Table 3 of Kaslow & Evans [79]); a long-distance-travel behaviour may also lead to the disease exportation to another, outside population (and thus, for the latter to a disease importation);
(B1c): degree of environmental infectiousness; this depends on the location and environment of the host k, which influences the duration of outside-host survival of the pathogens (and, if relevant, of the intermediate hosts or vectors) as well as the speed and range of their outside-host spread; for instance, high temperature may kill the pathogens, high airflow or rainfall dynamics may ease their spread, etc.

(B2)

the degree of susceptibility of uninfected individuals who have contact with k, with the following three major components (with similar background as their infectiousness counterparts):

(B2a): degree of biological susceptibility;
(B2b): degree of behavioural susceptibility;
(B2c): degree of environmental susceptibility.

All these factors (B1a) to (B2c) can, in principle, be influenced/controlled, each to a certain extent. Let us briefly discuss this for human infectious diseases, where one major goal of epidemic risk management is to implement countermeasures/interventions in order to slow down the disease transmission (e.g., by reducing the reproduction number

β_{•}

to less than 1) and eventually even break the chain of transmission, for the sake of containment or mitigation; preparedness and preparation are motives, too, for instance as a part of governmental pandemic risk management.

For instance, (B1a) can be reduced or even erased through pharmaceutical interventions such as medication (if available), and preventive strengthening of the immune system through non-extreme sports activities and healthy food.

Moreover, the following exemplary control measures for (B2) can be either put into action by common-sense self-behaviour, or by large-scale public recommendations (e.g., through mass media), or by rules/requirements from authorities:

(i): personal preventive measures such as frequent washing and disinfecting of hands; keeping hands away from face; covering coughs; avoidance of handshakes and hugs with non-family-members; maintaining physical distance (e.g., of two meters) from non-family-members; wearing a face-mask of respective security degree (such as homemade cloth face mask, particulate-filtering face-piece respirator, medical (non-surgical) mask, surgical mask); self-quarantine;
(ii): environmental measures, such as e.g., cleaning of surfaces;
(iii): community measures aimed at mild or stringent social distancing, such as e.g., prohibiting/cancelling/banning gatherings of more than z non-family members (e.g., $z = 2, 5, 10, 100, 1000$ in various different phases and countries during the current COVID-19 pandemic); mask-wearing (see above); closing of schools, universities, some or even all nonessential (“system-irrelevant”) businesses and venues; home-officing/work ban; home isolation of disease cases; isolation of homes for the elderly/aged (nursing homes); stay-at-home orders with exemptions, household or even general quarantine; testing & tracing; lockdown of entire cities and beyond; restricting the degrees of travel freedom/allowed mobility (e.g., local, union-state, national, international including border and airport closure). The latter also affects the mean importation rate $α_{•}$ , which can be controlled by vaccination programs in “outside populations”, too.

As far as the degree of biological susceptibility (B2a) is concerned, one obvious therapeutic countermeasure is a mass vaccination program/campaign (if available).

In case of highly virulent infectious diseases causing epidemics and pandemics with substantial fatality rates, some of the above-mentioned control strategies and countermeasures may (have to) be “drastic” (e.g., lockdown), and thus imply considerable social and economic costs, with a huge impact and potential danger of triggering severe social, economic and political disruptions.

In order to prepare corresponding suggestions for decisions about appropriate control measures (e.g., public policies), it is therefore important–especially for a novel infectious disease such as the current COVID-19 pandemic–to have a model for the time-evolution of the incidences in (i) a natural (basically uncontrolled) set-up, as well as in (ii) the control set-ups under consideration. As already mentioned above, we assume that all these situations can be distilled into an incidence evolution

n \mapsto X_{n}

which follows a Poissonian Galton-Watson process with respectively different parameter pairs

(β_{•}, α_{•})

. Correspondingly, we always compare two alternative models

(H)

and

(A)

with parameter pairs

(β_{H}, α_{H})

and

(β_{A}, α_{A})

which reflect either a “pure” statistical uncertainty (under the same uncontrolled or controlled set-up), or the uncertainty between two different potential control set-ups (for the sake of assessing the potential impact/efficiency of some planned interventions, compared with alternative ones); the economic impact can be also taken into account, within a Bayesian decision framework discussed in Section 2.5 below. As will be explained in the next subsections, we achieve such comparisons by means of density-based dissimilarity distances/divergences and related quantities thereof.

From the above-mentioned detailed explanations, it is immediately clear that for the described epidemiological context one should investigate all types of criticality and importation means for the therein involved two Poissonian Galton-Watson processes with/without immigration (respectively the equally distributed INARCH(1) models); in particular, this motivates (or even “justifies”) the necessity of the very lengthy detailed studies in Sections 3 to 7 below.

2.4. Information Measures

Having two competing models

(H)

and

(A)

at stake, it makes sense to study questions such as “how far are they apart?” and thus “how dissimilar are they?”. This can be quantified in terms of divergences in the sense of directed (i.e., not necessarily symmetric) distances, where usually the triangle inequality fails. Let us first discuss our employed divergence subclasses in a general set-up of two equivalent probability measures

P_{H}

,

P_{A}

on a measurable space

(Ω, F)

. In terms of the parameter

λ \in R

, the power divergences (also known as Cressie-Read divergences, relative Tsallis entropies, or the generalized cross-entropy family) are defined as (see e.g., Liese & Vajda [1,10])

0 \leq I_{λ} (P_{A} ∥ P_{H}) := \begin{cases} I (P_{A} ∥ P_{H}), & \text{if } λ = 1, \\ \frac{1}{λ (λ - 1)} \left( H_{λ} (P_{A} ∥ P_{H}) - 1 \right), & \text{if } λ \in R \setminus \{0, 1\}, \\ I (P_{H} ∥ P_{A}), & \text{if } λ = 0, \end{cases}

(2)

where

I (P_{A} ∥ P_{H}) : = \int p_{A} log \frac{p_{A}}{p_{H}} d μ \geq 0

(3)

is the Kullback-Leibler information divergence (also known as relative entropy) and

H_{λ} (P_{A} ∥ P_{H}) : = \int_{Ω} p_{A}^{λ} p_{H}^{1 - λ} d μ \geq 0

(4)

is the Hellinger integral of order

λ \in R \ {0, 1}

; for this, we assume as usual without loss of generality that the probability measures

P_{H}

,

P_{A}

are dominated by some

σ

–finite measure

μ

, with densities

p_{A} = \frac{d P_{A}}{d μ} and p_{H} = \frac{d P_{H}}{d μ}

(5)

defined on

Ω

(the zeros of

p_{H}

,

p_{A}

are handled in (3) and (4) with the usual conventions). Clearly, for

λ \in {0, 1}

one trivially gets

H_{0} (P_{A} ∥ P_{H}) = H_{1} (P_{A} ∥ P_{H}) = 1 .

The Kullback-Leibler information divergences (relative entropies) in (2) and (3) can alternatively be expressed as (see, e.g., Liese & Vajda [1])

I (P_{A} ∥ P_{H}) = lim_{λ ↗ 1} \frac{1 - H_{λ} (P_{A} ∥ P_{H})}{λ (1 - λ)}, I (P_{H} ∥ P_{A}) = lim_{λ ↘ 0} \frac{1 - H_{λ} (P_{A} ∥ P_{H})}{λ (1 - λ)} .

(6)
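To make the definitions (2) to (6) concrete, the following Python sketch evaluates them for two Poisson laws on $N_{0}$ (with $μ$ being the counting measure). The choice of Poisson laws with means 3 and 2, the truncation of the sums at 200 and all function names are assumptions for illustration. The closed-form expression $H_{λ} = \exp(a^{λ} b^{1-λ} - λ a - (1-λ) b)$ for Poisson means $a$ and $b$ is the result of a standard elementary calculation, and it is checked here against the direct summation of (4).

```python
import math


def log_poisson_pmf(k, mean):
    """Logarithm of the Poisson(mean) probability of the value k (mean > 0)."""
    return k * math.log(mean) - mean - math.lgamma(k + 1)


def hellinger_integral(lam, mean_a, mean_h, k_max=200):
    """H_lam(P_A || P_H) via (4), as a sum over 0..k_max (truncation is an assumption)."""
    return sum(
        math.exp(lam * log_poisson_pmf(k, mean_a)
                 + (1.0 - lam) * log_poisson_pmf(k, mean_h))
        for k in range(k_max + 1)
    )


def hellinger_closed_form(lam, mean_a, mean_h):
    """Closed form of H_lam for two Poisson laws."""
    return math.exp(mean_a ** lam * mean_h ** (1.0 - lam)
                    - lam * mean_a - (1.0 - lam) * mean_h)


def kullback_leibler(mean_a, mean_h):
    """I(P_A || P_H) of (3) for two Poisson laws."""
    return mean_a * math.log(mean_a / mean_h) - mean_a + mean_h


def power_divergence(lam, mean_a, mean_h):
    """I_lam(P_A || P_H) following the three cases of (2)."""
    if lam == 1.0:
        return kullback_leibler(mean_a, mean_h)
    if lam == 0.0:
        return kullback_leibler(mean_h, mean_a)
    return (hellinger_closed_form(lam, mean_a, mean_h) - 1.0) / (lam * (lam - 1.0))


for lam in (0.5, 0.9, 0.999, 1.0):
    print(lam, power_divergence(lam, 3.0, 2.0))
```

The printed values approach the Kullback-Leibler information divergence as $λ ↗ 1$, in line with (6).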

Apart from the Kullback-Leibler information divergence (relative entropy), other prominent examples of power divergences are the squared Hellinger distance

\frac{1}{2} I_{1 / 2} (P_{A} ∥ P_{H})

and Pearson’s

χ^{2} -

divergence

2 I_{2} (P_{A} ∥ P_{H})

; the Hellinger integral

H_{1 / 2} (P_{A} ∥ P_{H})

is also known as (multiple of) the Bhattacharyya coefficient. Extensive studies about basic and advanced general facts on power divergences, Hellinger integrals and the related Renyi divergences of order

λ \in R \ {0, 1}

0 \leq R_{λ} (P_{A} ∥ P_{H}) : = \frac{1}{λ (λ - 1)} log H_{λ} (P_{A} ∥ P_{H}), with log 0 = - \infty,

(7)

can be found e.g., in Liese & Vajda [1,10], Jacod & Shiryaev [24], van Erven & Harremoes [20] (as a side remark,

R_{1 / 2} (P_{A} ∥ P_{H})

is also known as (multiple of) Bhattacharyya distance). For instance, the integrals in (3) and (4) do not depend on the choice of

μ

. Furthermore, one has the skew symmetries

H_{λ} (P_{A} ∥ P_{H}) = H_{1 - λ} (P_{H} ∥ P_{A}), as well as I_{λ} (P_{A} ∥ P_{H}) = I_{1 - λ} (P_{H} ∥ P_{A}),

(8)

for all

λ \in R

(see e.g., Liese & Vajda [1]). As far as finiteness is concerned, for

λ \in] 0, 1 [

one gets the rudimentary bounds

\begin{matrix} 0 < H_{λ} (P_{A} ∥ P_{H}) \leq 1, and equivalently, \end{matrix}

(9)

\begin{matrix} 0 \leq I_{λ} (P_{A} ∥ P_{H}) = \frac{1 - H_{λ} (P_{A} ∥ P_{H})}{λ (1 - λ)} < \frac{1}{λ (1 - λ)}, \end{matrix}

(10)

where the lower bound in (10) (upper bound in (9)) is achieved iff

P_{A} = P_{H}

. For

λ \in R \setminus ] 0, 1 [

, one gets the bounds

0 \leq I_{λ} (P_{A} ∥ P_{H}) \leq \infty, and equivalently, 1 \leq H_{λ} (P_{A} ∥ P_{H}) \leq \infty,

(11)

where, in contrast to above, both the lower bound of

H_{λ} (P_{A} ∥ P_{H})

and the lower bound of

I_{λ} (P_{A} ∥ P_{H})

are achieved iff

P_{A} = P_{H}

; however, the power divergence

I_{λ} (P_{A} ∥ P_{H})

and Hellinger integral

H_{λ} (P_{A} ∥ P_{H})

might be infinite, depending on the particular setup.
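The skew symmetries (8) and the bounds (9) to (11) can be checked numerically on a small example. The following Python sketch does so for two strictly positive (and hence equivalent) probability vectors on a three-point space; the two vectors, the grid of orders $λ$ and the function names are hypothetical choices for illustration.

```python
def hellinger_integral(lam, p_a, p_h):
    """H_lam(P_A || P_H) of (4) for probability vectors on a finite space."""
    return sum(a ** lam * h ** (1.0 - lam) for a, h in zip(p_a, p_h))


def power_divergence(lam, p_a, p_h):
    """I_lam(P_A || P_H) of (2) for lam outside {0, 1}."""
    return (hellinger_integral(lam, p_a, p_h) - 1.0) / (lam * (lam - 1.0))


p_a = [0.5, 0.3, 0.2]  # hypothetical law P_A
p_h = [0.2, 0.3, 0.5]  # hypothetical law P_H

for lam in (0.1, 0.5, 0.9):   # orders inside ]0, 1[: bounds (9) and (10)
    h = hellinger_integral(lam, p_a, p_h)
    i = power_divergence(lam, p_a, p_h)
    print(lam, h, i, 0.0 < h <= 1.0, 0.0 <= i < 1.0 / (lam * (1.0 - lam)))

for lam in (-1.0, 2.0, 3.0):  # orders outside [0, 1]: bounds (11)
    h = hellinger_integral(lam, p_a, p_h)
    print(lam, h, h >= 1.0, power_divergence(lam, p_a, p_h) >= 0.0)

# skew symmetry (8): H_lam(P_A || P_H) = H_{1-lam}(P_H || P_A)
print(hellinger_integral(0.3, p_a, p_h), hellinger_integral(0.7, p_h, p_a))
```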

The Hellinger integrals can be also used for bounds of the well-known total variation

0 \leq V (P_{A} ∥ P_{H}) : = 2 sup_{A \in F} \{P_{A} (A) - P_{H} (A)\} = \int_{Ω} |p_{A} - p_{H}| d μ,

with

p_{A}

and

p_{H}

defined in (5). Certainly, the total variation is one of the best known statistical distances, see e.g., Le Cam [109]. For arbitrary

λ \in] 0, 1 [

there holds (cf. Liese & Vajda [1])

1 - \frac{V (P_{A} ∥ P_{H})}{2} \leq H_{λ} (P_{A} ∥ P_{H}) \leq {(1 + \frac{V (P_{A} ∥ P_{H})}{2})}^{max {λ, 1 - λ}} {(1 - \frac{V (P_{A} ∥ P_{H})}{2})}^{min {λ, 1 - λ}} .

From this together with the particular choice

λ = \frac{1}{2}

, we can derive the fundamental universal bounds

2 (1 - H_{\frac{1}{2}} (P_{A} ∥ P_{H})) \leq V (P_{A} ∥ P_{H}) \leq 2 \sqrt{1 - {(H_{\frac{1}{2}} (P_{A} ∥ P_{H}))}^{2}} .

(12)
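As a plausibility check of the bounds (12), the following Python sketch computes the total variation and the Hellinger integral of order $1/2$ for two Poisson laws and verifies that the total variation lies between the two bounds. The Poisson means 3 and 2, the truncation of the sums at 200 and the function names are assumptions for illustration.

```python
import math


def poisson_pmf(k, mean):
    """Poisson(mean) probability of the value k, computed on the log scale."""
    return math.exp(k * math.log(mean) - mean - math.lgamma(k + 1))


def total_variation(mean_a, mean_h, k_max=200):
    """V(P_A || P_H) as the sum of |p_A - p_H| over 0..k_max."""
    return sum(abs(poisson_pmf(k, mean_a) - poisson_pmf(k, mean_h))
               for k in range(k_max + 1))


def hellinger_integral(lam, mean_a, mean_h, k_max=200):
    """H_lam(P_A || P_H) for lam in ]0, 1[, as a truncated sum."""
    return sum(poisson_pmf(k, mean_a) ** lam * poisson_pmf(k, mean_h) ** (1.0 - lam)
               for k in range(k_max + 1))


v = total_variation(3.0, 2.0)
h_half = hellinger_integral(0.5, 3.0, 2.0)
lower = 2.0 * (1.0 - h_half)
upper = 2.0 * math.sqrt(1.0 - h_half ** 2)
print(lower, v, upper)  # the middle value should lie between the outer two
```

For these two laws the lower bound is rather loose, while the upper bound is of the same order as the total variation itself.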

We apply these concepts to our setup of Section 2.1 with two competing models

(H)

and

(A)

of Galton-Watson processes with immigration, where one can take

Ω \subset N_{0}^{N_{0}}

to be the space of all paths of

{(X_{n})}_{n \in N}

. More detailed, in terms of the extinction set

B : = {τ < \infty}

and the parameter-set notation (PS1) to (PS3), it is known that for

P_{SP}

the two laws

P_{H}

and

P_{A}

are equivalent, whereas for

P_{NI}

the two restrictions

{P_{H}|}_{B}

and

{P_{A}|}_{B}

are equivalent (see e.g., Lemma 1.1.3 of Guttorp [52]); with a slight abuse of notation we shall henceforth omit

|_{B}

. Consistently, for fixed time

n \in N_{0}

we introduce

P_{A, n} : = {P_{A}|}_{F_{n}}

and

P_{H, n} : = {P_{H}|}_{F_{n}}

as well as the corresponding Radon-Nikodym-derivative (likelihood ratio)

Z_{n} : = \frac{d P_{A, n}}{d P_{H, n}},

(13)

where

{(F_{n})}_{n \in N}

denotes the corresponding canonical filtration generated by

X : = {(X_{n})}_{n \in N}

; in other words,

F_{n}

reflects the “process-intrinsic” information known at stage n. Clearly,

Z_{0} = 1

. By choosing the reference measure

μ = P_{H, n}

one obtains from (4) the Hellinger integral

H_{λ} (P_{A, 0} ∥ P_{H, 0}) = 1

, as well as for all

n \in N

H_{λ} (P_{A, n} ∥ P_{H, n}) = E_{P_{H, n}} [{(Z_{n})}^{λ}],

(14)

I (P_{A, n} ∥ P_{H, n}) = E_{P_{A, n}} [\log Z_{n}],

(15)

from which one can immediately build

I_{λ} (P_{A, n} ∥ P_{H, n})

(

λ \in R

) respectively

R_{λ} (P_{A, n} ∥ P_{H, n})

(

λ \in R \ {0, 1}

) respectively bounds of

V (P_{A, n} ∥ P_{H, n})

via (2) respectively (7) respectively (12).
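For the Poissonian GWI, the likelihood ratio (13) can be written down explicitly along an observed path, since under either model $X_{k}$ given $X_{k-1}$ is Poisson-distributed with mean $β_{•} X_{k-1} + α_{•}$. The following Python sketch computes $\log Z_{n}$ in this way and checks the one-step case of (14) against the closed-form Poisson Hellinger integral. The function names, the two example paths and the summation cut-off are hypothetical; the sketch also assumes that at every step the two conditional means are either both positive or both zero (as is the case if both importation means are positive, or if both vanish).

```python
import math


def log_likelihood_ratio(path, beta_a, alpha_a, beta_h, alpha_h):
    """log Z_n for an observed path (X_0, ..., X_n) of a Poissonian GWI."""
    log_z = 0.0
    for prev, cur in zip(path, path[1:]):
        mean_a = beta_a * prev + alpha_a
        mean_h = beta_h * prev + alpha_h
        if mean_a == 0.0 and mean_h == 0.0:
            continue  # both laws force cur == 0, so this factor equals 1
        # ratio of two Poisson probabilities; the factorials cancel
        log_z += cur * math.log(mean_a / mean_h) - (mean_a - mean_h)
    return log_z


def one_step_hellinger_by_sum(lam, x0, beta_a, alpha_a, beta_h, alpha_h, k_max=300):
    """E_{P_H,1}[(Z_1)^lam], cf. (14), by summation over X_1 in 0..k_max."""
    mean_h = beta_h * x0 + alpha_h
    total = 0.0
    for x1 in range(k_max + 1):
        log_p_h = x1 * math.log(mean_h) - mean_h - math.lgamma(x1 + 1)
        log_z = log_likelihood_ratio([x0, x1], beta_a, alpha_a, beta_h, alpha_h)
        total += math.exp(log_p_h + lam * log_z)
    return total


def one_step_hellinger_closed_form(lam, x0, beta_a, alpha_a, beta_h, alpha_h):
    """Closed-form Poisson Hellinger integral for the first generation."""
    mean_a = beta_a * x0 + alpha_a
    mean_h = beta_h * x0 + alpha_h
    return math.exp(mean_a ** lam * mean_h ** (1.0 - lam)
                    - lam * mean_a - (1.0 - lam) * mean_h)


params = dict(beta_a=2.0, alpha_a=10.0, beta_h=0.5, alpha_h=0.2)
print(log_likelihood_ratio([5, 9, 21, 48], **params))  # growing path
print(log_likelihood_ratio([5, 2, 1, 0], **params))    # dying-out path
print(one_step_hellinger_by_sum(0.5, 5, **params),
      one_step_hellinger_closed_form(0.5, 5, **params))
```

A positive value of $\log Z_{n}$ indicates that the observed path is more plausible under (A) than under (H), and a negative value indicates the opposite.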

The outcoming values (respectively bounds) of

H_{λ} (P_{A, n} ∥ P_{H, n})

are quite diverse and depend on the choice of the involved parameter pairs

(β_{H}, α_{H})

,

(β_{A}, α_{A})

as well as

λ

; the exact details will be given in Section 3 and Section 6 below.

Before we achieve this, in the following we explain how the outcoming dissimilarity results can be applied to Bayesian testing and more general Bayesian decision making, as well as to Neyman-Pearson testing.

2.5. Decision Making under Uncertainty

Within the above-mentioned context of two competing models

(H)

and

(A)

of Galton-Watson processes with immigration, let us briefly discuss how knowledge about the time-evolution of the Hellinger integrals

H_{λ} (P_{A, n} ∥ P_{H, n})

–or equivalently, of the power divergences

I_{λ} (P_{A, n} ∥ P_{H, n})

, cf. (2)—can be used in order to take decisions under uncertainty, within a framework of Bayesian decision making BDM, or alternatively, of Neyman-Pearson testing NPT.

In our context of BDM, we decide between an action

d_{H}

“associated with” the (say) hypothesis law

P_{H}

and an action

d_{A}

“associated with” the (say) alternative law

P_{A}

, based on the sample path observation

X_{n} : = {X_{l} : l \in {0, 1, \dots, n}}

of the GWI-generation-sizes (e.g., infectious-disease incidences, cf. Section 2.3) up to observation horizon

n \in N

. Following the lines of Stummer & Vajda [15] (adapted to our branching process context), for our BDM let us consider as admissible decision rules

δ_{n} : Ω_{n} \mapsto {d_{H}, d_{A}}

the ones generated by all path sets

G_{n} \subset Ω_{n}

(where

Ω_{n}

denotes the space of all possible paths of

{(X_{k})}_{k \in {1, \dots, n}}

) through

δ_{n} (X_{n}) := δ_{G_{n}} (X_{n}) := \begin{cases} d_{A}, & \text{if } X_{n} \in G_{n}, \\ d_{H}, & \text{if } X_{n} \notin G_{n}, \end{cases}

as well as loss functions of the form

\begin{pmatrix} L (d_{H}, H) & L (d_{H}, A) \\ L (d_{A}, H) & L (d_{A}, A) \end{pmatrix} := \begin{pmatrix} 0 & L_{A} \\ L_{H} & 0 \end{pmatrix}

(16)

with pregiven constants

L_{A} > 0

,

L_{H} > 0

(e.g., arising as bounds from quantities in worst-case scenarios); notice that in (16),

d_{H}

is assumed to be a zero-loss action under

H

and

d_{A}

a zero-loss action under

A

. Per definition, the Bayes decision rule

δ_{G_{n, \min}}

minimizes–over

G_{n}

—the mean decision loss

\begin{aligned} L (δ_{G_{n}}) := {} & p_{H}^{prior} \cdot L_{H} \cdot Pr (δ_{G_{n}} (X_{n}) = d_{A} \mid H) + p_{A}^{prior} \cdot L_{A} \cdot Pr (δ_{G_{n}} (X_{n}) = d_{H} \mid A) \\ = {} & p_{H}^{prior} \cdot L_{H} \cdot P_{H, n} (G_{n}) + p_{A}^{prior} \cdot L_{A} \cdot P_{A, n} (Ω_{n} \setminus G_{n}) \end{aligned}

(17)

for given prior probabilities

p_{H}^{prior} = Pr (H) \in ] 0, 1 [

for

H

and

p_{A}^{prior} := Pr (A) = 1 - p_{H}^{prior}

for

A

. As a side remark let us mention that, in a certain sense, the involved model (parameter) uncertainty expressed by the “superordinate” Bernoulli-type law

Pr = Bin (1, p_{H}^{prior})

can also be reinterpreted as a rudimentary static random environment caused e.g., by a random Bernoulli-type external static force. By straightforward calculations, one gets with (13) the minimizing path set

G_{n, \min} = \{Z_{n} \geq \frac{p_{H}^{prior} L_{H}}{p_{A}^{prior} L_{A}}\}

leading to the minimal mean decision loss, i.e., the Bayes risk,

R_{n} := \min_{G_{n}} L (δ_{G_{n}}) = L (δ_{G_{n, \min}}) = \int_{Ω_{n}} \min \{p_{H}^{prior} L_{H}, p_{A}^{prior} L_{A} Z_{n}\} \, d P_{H, n} .

(18)
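The minimizing path set and the Bayes risk (18) can be illustrated in the simplest case $n = 1$, where only the first generation $X_{1}$ is observed. The following Python sketch compares the mean decision loss (17) of the Bayes rule with the integral representation (18), and with the loss of two other admissible rules. The initial value $X_{0} = 5$, the prior probability, the two loss constants, the summation cut-off and all names are hypothetical choices; the parameter pairs are the exemplary values of the sequential-monitoring example later in this section.

```python
import math

P_H_PRIOR, LOSS_H, LOSS_A = 0.5, 1.0, 4.0  # hypothetical prior and losses
MEAN_H = 0.5 * 5 + 0.2    # mean of X_1 under (H), given X_0 = 5
MEAN_A = 2.0 * 5 + 10.0   # mean of X_1 under (A), given X_0 = 5
K_MAX = 300               # summation cut-off (assumption)


def poisson_pmf(k, mean):
    """Poisson(mean) probability of the value k, computed on the log scale."""
    return math.exp(k * math.log(mean) - mean - math.lgamma(k + 1))


def log_z(x):
    """log Z_1 for the observation X_1 = x (ratio of two Poisson probabilities)."""
    return x * math.log(MEAN_A / MEAN_H) - (MEAN_A - MEAN_H)


def in_g_min(x):
    """Membership in the minimizing path set {Z_1 >= p_H L_H / (p_A L_A)}."""
    threshold = (P_H_PRIOR * LOSS_H) / ((1.0 - P_H_PRIOR) * LOSS_A)
    return log_z(x) >= math.log(threshold)


def mean_decision_loss(in_g):
    """Mean decision loss (17) of the rule 'take d_A iff X_1 lies in G'."""
    total = 0.0
    for x in range(K_MAX + 1):
        if in_g(x):
            total += P_H_PRIOR * LOSS_H * poisson_pmf(x, MEAN_H)
        else:
            total += (1.0 - P_H_PRIOR) * LOSS_A * poisson_pmf(x, MEAN_A)
    return total


def bayes_risk():
    """Bayes risk via (18), rewritten as a sum of pointwise minima."""
    return sum(min(P_H_PRIOR * LOSS_H * poisson_pmf(x, MEAN_H),
                   (1.0 - P_H_PRIOR) * LOSS_A * poisson_pmf(x, MEAN_A))
               for x in range(K_MAX + 1))


print(bayes_risk(), mean_decision_loss(in_g_min))
print(mean_decision_loss(lambda x: x >= 3), mean_decision_loss(lambda x: True))
```

The first two printed numbers coincide up to rounding, while the two alternative rules (a cruder threshold and "always take $d_{A}$") lead to larger mean decision losses.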

Notice that—by straightforward standard arguments—the alternative decision procedure

\text{take action } d_{A} \text{ (resp. } d_{H} \text{) if } L_{H} \cdot p_{H}^{post} (X_{n}) \leq \text{ (resp. } > \text{) } L_{A} \cdot p_{A}^{post} (X_{n})

with posterior probabilities

p_{H}^{post} (X_{n}) : = \frac{p_{H}^{prior}}{(1 - p_{H}^{prior}) \cdot Z_{n} (X_{n}) + p_{H}^{prior}} = : 1 - p_{A}^{post} (X_{n})

, leads exactly to the same actions as

δ_{G_{n, \min}}

. By adapting the Lemma 6.5 of Stummer & Vajda [15]—which on general probability spaces gives fundamental universal inequalities relating Hellinger integrals (or equivalently, power divergences) and Bayes risks—one gets for all

L_{H} > 0

,

L_{A} > 0

,

p_{H}^{prior} \in] 0, 1 [

,

λ \in] 0, 1 [

and

n \in N

the upper bound

R_{n} \leq Λ_{A}^{λ} Λ_{H}^{1 - λ} H_{λ} (P_{A, n} ∥ P_{H, n}), \quad \text{with } Λ_{H} := p_{H}^{prior} L_{H}, \quad Λ_{A} := (1 - p_{H}^{prior}) L_{A},

(19)

as well as the lower bound

{(R_{n})}^{min {λ, 1 - λ}} \cdot {(Λ_{H} + Λ_{A} - R_{n})}^{max {λ, 1 - λ}} \geq Λ_{A}^{λ} Λ_{H}^{1 - λ} H_{λ} (P_{A, n} ∥ P_{H, n})

which implies in particular the “direct” lower bound

R_{n} \geq \frac{Λ_{A}^{max {1, \frac{λ}{1 - λ}}} Λ_{H}^{max {1, \frac{1 - λ}{λ}}}}{{(Λ_{A} + Λ_{H})}^{max {\frac{λ}{1 - λ}, \frac{1 - λ}{λ}}}} \cdot {(H_{λ} (P_{A, n} ∥ P_{H, n}))}^{max {\frac{1}{λ}, \frac{1}{1 - λ}}} .

(20)
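The bounds (19) and (20) can be checked numerically in the one-step case $n = 1$, where both the Bayes risk and the Hellinger integral are available by direct computation. The following Python sketch does this for several orders $λ \in ]0, 1[$. The initial value $X_{0} = 5$, the prior probability, the loss constants, the summation cut-off and the function names are hypothetical; the closed-form Poisson Hellinger integral used here is the result of a standard elementary calculation for two Poisson laws.

```python
import math

WEIGHT_H = 0.5 * 1.0      # Lambda_H = p_H^prior * L_H (hypothetical values)
WEIGHT_A = 0.5 * 4.0      # Lambda_A = (1 - p_H^prior) * L_A (hypothetical values)
MEAN_H = 0.5 * 5 + 0.2    # mean of X_1 under (H), given X_0 = 5
MEAN_A = 2.0 * 5 + 10.0   # mean of X_1 under (A), given X_0 = 5


def poisson_pmf(k, mean):
    """Poisson(mean) probability of the value k, computed on the log scale."""
    return math.exp(k * math.log(mean) - mean - math.lgamma(k + 1))


def bayes_risk(k_max=300):
    """Bayes risk R_1 via (18), as a sum of pointwise minima over 0..k_max."""
    return sum(min(WEIGHT_H * poisson_pmf(x, MEAN_H), WEIGHT_A * poisson_pmf(x, MEAN_A))
               for x in range(k_max + 1))


def hellinger_integral(lam):
    """Closed-form H_lam(P_{A,1} || P_{H,1}) for the two Poisson laws of X_1."""
    return math.exp(MEAN_A ** lam * MEAN_H ** (1.0 - lam)
                    - lam * MEAN_A - (1.0 - lam) * MEAN_H)


def upper_bound(lam):
    """Right-hand side of (19)."""
    return WEIGHT_A ** lam * WEIGHT_H ** (1.0 - lam) * hellinger_integral(lam)


def direct_lower_bound(lam):
    """Right-hand side of (20)."""
    ratio, inverse = lam / (1.0 - lam), (1.0 - lam) / lam
    factor = (WEIGHT_A ** max(1.0, ratio) * WEIGHT_H ** max(1.0, inverse)
              / (WEIGHT_A + WEIGHT_H) ** max(ratio, inverse))
    return factor * hellinger_integral(lam) ** max(1.0 / lam, 1.0 / (1.0 - lam))


risk = bayes_risk()
for lam in (0.1, 0.3, 0.5, 0.7, 0.9):
    print(lam, direct_lower_bound(lam), risk, upper_bound(lam))
```

Since the bounds hold for every $λ \in ]0, 1[$, one may in practice take the smallest upper bound and the largest lower bound over a grid of orders.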

By using (19) (respectively (20)) together with the exact values and the upper (respectively lower) bounds of the Hellinger integrals

H_{λ} (P_{A, n} ∥ P_{H, n})

derived in the following sections, we end up with upper (respectively lower) bounds of the Bayes risk

R_{n}

. Of course, with the help of (2) the bounds (19) and (20) can be (i) immediately rewritten in terms of the power divergences

I_{λ} (P_{A, n} ∥ P_{H, n})

and (ii) thus be directly interpreted in terms of dissimilarity-size arguments. As a side-remark, in such a Bayesian context the

λ -

order Hellinger integral

H_{λ} (P_{A, n} ∥ P_{H, n}) = E_{P_{H, n}} [{(Z_{n})}^{λ}]

(cf. (14)) can be also interpreted as

λ -

order Bayes-factor moment (with respect to

P_{H, n}

), since

Z_{n} = Z_{n} (X_{n}) = \frac{p_{A}^{post} (X_{n})}{p_{H}^{post} (X_{n})} / \frac{p_{A}^{prior}}{p_{H}^{prior}}

is the Bayes factor (i.e., the posterior odds ratio of

(A)

to

(H)

, divided by the prior odds ratio of

(A)

to

(H)

).
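The relation between the likelihood ratio, the posterior probabilities and the two equivalent decision procedures can be verified with a few lines of Python. In the following sketch, the prior probability, the loss constants, the grid of likelihood-ratio values and the function names are hypothetical choices for illustration.

```python
def posterior_h(z, p_h_prior):
    """Posterior probability of (H), given the observed likelihood ratio z = Z_n."""
    return p_h_prior / ((1.0 - p_h_prior) * z + p_h_prior)


def bayes_factor(z, p_h_prior):
    """Posterior odds ratio of (A) to (H), divided by the prior odds ratio."""
    post_h = posterior_h(z, p_h_prior)
    posterior_odds = (1.0 - post_h) / post_h
    prior_odds = (1.0 - p_h_prior) / p_h_prior
    return posterior_odds / prior_odds


def action_by_threshold(z, p_h_prior, loss_h, loss_a):
    """Bayes rule: take d_A iff Z_n >= p_H^prior L_H / (p_A^prior L_A)."""
    threshold = (p_h_prior * loss_h) / ((1.0 - p_h_prior) * loss_a)
    return "d_A" if z >= threshold else "d_H"


def action_by_posterior(z, p_h_prior, loss_h, loss_a):
    """Alternative procedure: take d_A iff L_H p_H^post <= L_A p_A^post."""
    post_h = posterior_h(z, p_h_prior)
    return "d_A" if loss_h * post_h <= loss_a * (1.0 - post_h) else "d_H"


for z in (0.01, 0.1, 0.5, 1.0, 10.0, 100.0):
    print(z, bayes_factor(z, 0.5),
          action_by_threshold(z, 0.5, 1.0, 4.0),
          action_by_posterior(z, 0.5, 1.0, 4.0))
```

For each value of $Z_{n}$, the computed Bayes factor reproduces $Z_{n}$ itself, and both procedures select the same action.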

At this point, the potential applicant should be warned about the usual way of asynchronous decision making, where one first tests

(A)

versus

(H)

(i.e.,

L_{A} = L_{H} = 1

which leads to 0–1 losses in (16)) and afterwards, based on the outcoming result (e.g., in favour of

(A)

), takes the attached economic decision (e.g.,

d_{A}

); this can lead to distortions compared with synchronous decision making with “full” monetary losses

L_{A}

and

L_{H}

, as is shown in Stummer & Lao [16] within an economic context in connection with discrete approximations of financial diffusion processes (they call this distortion effect a non-commutativity between Bayesian statistical and investment decisions).

For different types of Bayesian analyses based on GW(I) generation size observations, mainly concerning parameter estimation (with squared-error type loss functions), see e.g., Jagers [56], Heyde [38], Heyde & Johnstone [110], Johnson et al. [111], Basawa & Rao [60], Basawa & Scott [61], Scott [112], Guttorp [52], Yanev & Tsokos [113], Mendoza & Gutierrez-Pena [114], and the references therein.

Within our running-example epidemiological context of Section 2.3, let us briefly discuss the role of the above-mentioned losses

L_{A}

and

L_{H}

. To begin with, as mentioned above the unit-free choice

L_{A} = L_{H} = 1

corresponds to Bayesian testing. Recall that this concerns two alternative infectious-disease models

(H)

and

(A)

with parameter pairs (recall the interpretation of

β_{•}

as reproduction number and

α_{•}

as importation mean)

(β_{H}, α_{H})

and

(β_{A}, α_{A})

which reflect either a “pure” statistical uncertainty (under the same uncontrolled or controlled set-up), or the uncertainty between two different potential control set-ups (for the sake of assessing the potential impact/efficiency of some planned interventions, compared with alternative ones). As far as non-unit-free (e.g., macroeconomic or monetary) losses are concerned, recall that some of the above-mentioned control strategies (countermeasures, public policies, governmental pandemic risk management plans) may imply considerable social and economic costs, with a huge impact and potential danger of triggering severe social, economic and political disruptions; a corresponding tradeoff between health and economic issues can be incorporated by choosing

L_{A}

and

L_{H}

to be (e.g., monetary) values which reflect estimates or upper bounds of losses due to wrong decisions. Two examples of such wrong decisions at stage n are as follows: first, due to the observed data one erroneously thinks (reinforced by fear) that a novel infectious disease (e.g., COVID-19) will lead to a severe pandemic (or to its re-emergence) and consequently decides for a lockdown with drastic future economic consequences; second, one erroneously thinks (reinforced by carelessness) that the infectious disease is (or stays) non-severe and consequently eases some/all control measures, which will lead to extremely devastating future economic consequences. For the estimates/bounds of

L_{A}

and

L_{H}

, one can e.g., employ (i) the comprehensive stochastic studies of Feicht & Stummer [115] on the quantitative degree of elasticity and speed of recovery of economies after a sudden macroeconomic disaster, or (ii) the more short-term, German-specific, scenario-type (basically non-stochastic) studies of Dorn et al. [116,117] in connection with the current COVID-19 pandemic.

Of course, the above-mentioned Bayesian decision procedure can also be operated in a sequential way. For instance, suppose that we are confronted with a novel infectious disease (e.g., COVID-19) of non-negligible fatality rate and let

(A)

reflect a “potentially dangerous” infectious-disease-transmission situation (e.g., a reproduction number of substantially supercritical case

β_{A} = 2

, and an importation mean of

α_{A} = 10

, for weekly appearing new incidence-generations) whereas

(H)

describes a “relatively harmless/mild” situation (e.g., a substantially subcritical

β_{H} = 0.5

,

α_{H} = 0.2

). Moreover, let

d_{A}

respectively

d_{H}

denote (non-quantitatively) the decision/action to accept

(A)

respectively

(H)

. It can then be reasonable to decide to stop the observation process

n \mapsto X_{n}

(also called surveillance or online-monitoring) of incidence numbers at the first time at which

n \mapsto Z_{n} = Z_{n} (X_{n})

exceeds the threshold

p_{H}^{prior} / p_{A}^{prior}

; if this happens, one takes

d_{A}

as decision (and e.g., declares the situation to be an epidemic outbreak and starts control/intervention measures; however, as explained above, one should synchronously also involve the potential economic losses), whereas as long as this does not happen, one continues the observation (and implicitly takes

d_{H}

as decision). This can be modelled in terms of the pair

(\tilde{τ}, d_{A})

with (random) stopping time

\tilde{τ} : = inf \{n \in N : Z_{n} \geq \frac{p_{H}^{prior}}{p_{A}^{prior}}\}

(with the usual convention that the infimum of the empty set is infinity), and the corresponding decision

d_{A}

. After the time

\tilde{τ} < \infty

and e.g., immediate subsequent employment of some control/counter measures, one can e.g., take the old model

(A)

as new

(H)

, declare a new target

(A)

for the desired quantification of the effectiveness of the employed control measures (e.g., a mitigation to a slightly subcritical case of

β_{A} = 0.95

,

α_{A} = 0.8

), and start to observe the new incidence numbers until the new target

(A)

has been reached. This can be interpreted as online-detection of a distributional change; a related comprehensive new framework for the use of divergences (even much beyond power divergences) for distributional change detection can be found e.g., in the recent work of Kißlinger & Stummer [118]. A completely different, SIR-model based, approach for the detection of change points in the spread of COVID-19 is given in Dehning et al. [119]. Moreover, other surveillance methods can also be found e.g., in the corresponding overview of Frisén [120] and the Swedish epidemics outbreak investigations of Frisén & Andersson & Schiöler [121].
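The one-threshold surveillance rule described above can be sketched numerically. The following Python fragment is only a minimal illustration under stated assumptions, not the authors' implementation: it uses the Poisson-GWI likelihood ratio Z_{n} as the product of the factors exp{-(f_{A}(x_{k-1}) - f_{H}(x_{k-1}))} [f_{A}(x_{k-1}) / f_{H}(x_{k-1})]^{x_{k}} with f_{•}(x) = β_{•} x + α_{•} (cf. Section 3.1), works on the log scale, and assumes strictly positive importation means. The function names and the toy incidence path are hypothetical.

```python
import math

def log_lr_path(path, beta_a, alpha_a, beta_h, alpha_h):
    """Return [log Z_1, ..., log Z_n] for an observed path (x_0, ..., x_n)."""
    log_z, out = 0.0, []
    for x_prev, x_next in zip(path, path[1:]):
        f_a = beta_a * x_prev + alpha_a   # rate under (A)
        f_h = beta_h * x_prev + alpha_h   # rate under (H)
        log_z += -(f_a - f_h) + x_next * math.log(f_a / f_h)
        out.append(log_z)
    return out

def first_crossing_time(path, prior_a, **params):
    """Smallest n >= 1 with Z_n >= p_H^prior / p_A^prior; None if never reached."""
    log_threshold = math.log((1.0 - prior_a) / prior_a)
    for n, log_z in enumerate(log_lr_path(path, **params), start=1):
        if log_z >= log_threshold:
            return n
    return None  # corresponds to an infinite stopping time on the observed stages

# Parameter values taken from the example in the text; the path is made up.
params = dict(beta_a=2.0, alpha_a=10.0, beta_h=0.5, alpha_h=0.2)
toy_path = [1, 3, 8]
print(first_crossing_time(toy_path, prior_a=0.5, **params))
```

Working with log Z_{n} avoids numerical overflow for long paths; the threshold is transformed accordingly.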

One can refine the above-mentioned sequential procedure via two (instead of one) appropriate thresholds

c_{1} < c_{2}

and the pair

(\overset{˘}{τ}, δ_{\overset{˘}{τ}})

, with the stopping time

\overset{˘}{τ} : = inf \{n \in N : Z_{n} \notin [c_{1}, c_{2}]\}

as well as corresponding decision rule

δ_{\overset{˘}{τ}} := \begin{cases} d_{A}, & \text{if } Z_{\overset{˘}{τ}} > c_{2}, \\ d_{H}, & \text{if } Z_{\overset{˘}{τ}} < c_{1} . \end{cases}

An exact optimized treatment of the two above-mentioned sequential procedures, and their connection to Hellinger integrals (and power divergences) of Galton-Watson processes with immigration, is beyond the scope of this paper.
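For illustration only, the two-threshold refinement can be sketched in the same spirit. The Python fragment below assumes the Poisson-GWI likelihood ratio of Section 3.1, strictly positive importation means and thresholds 0 < c_{1} < c_{2}; the function name, the threshold values and the toy path are hypothetical choices and are not optimized in any sense.

```python
import math

def two_threshold_decision(path, c1, c2, beta_a, alpha_a, beta_h, alpha_h):
    """Return (stage, decision) for the first n with Z_n outside [c1, c2]."""
    log_z = 0.0
    for n in range(1, len(path)):
        f_a = beta_a * path[n - 1] + alpha_a
        f_h = beta_h * path[n - 1] + alpha_h
        log_z += -(f_a - f_h) + path[n] * math.log(f_a / f_h)
        if log_z > math.log(c2):
            return n, "d_A"   # accept (A)
        if log_z < math.log(c1):
            return n, "d_H"   # accept (H)
    return None, None         # no exit from [c1, c2] within the observed stages

print(two_threshold_decision([1, 3, 8], 0.01, 2.0, 2.0, 10.0, 0.5, 0.2))
```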

As a side remark, let us mention that our above-mentioned suggested method of Bayesian decision making with Hellinger integrals of GWIs differs completely from the very recent work of Brauner et al. [122] who use a Bayesian hierarchical model for the concrete, very comprehensive study on the effectiveness and burden of non-pharmaceutical interventions against COVID-19 transmission.

The power divergences

I_{λ} (P_{A, n} ∥ P_{H, n})

(

λ \in R

) can be employed also in other ways within Bayesian decision making, of statistical nature. Namely, by adapting the general lines of Österreicher & Vajda [123] (see also Liese & Vajda [10], as well as diffusion-process applications in Stummer [5,31,32]) to our context of Galton-Watson processes with immigration, we can proceed as follows. For the sake of comfortable notations, we first attach the value

θ : = 1

to the GWI model

(A)

(which has prior probability

p_{A}^{prior} \in] 0, 1 [

) and

θ : = 0

to

(H)

(which has prior probability

1 - p_{A}^{prior}

). Suppose we want to decide, in an optimal Bayesian way, which degree of evidence

d e g \in [0, 1]

we should attribute (according to a pregiven loss function

LO

) to the model

(A)

. In order to achieve this goal, we choose a nonnegatively-valued loss function

LO (θ, d e g)

defined on

{0, 1} \times [0, 1]

, of two types which will be specified below. The risk at stage 0 (i.e., prior to the GWI-path observations

X_{n}

), from the optimal decision about the degree of evidence

d e g

concerning the decision parameter

θ

, is defined as

{BR}_{LO} (p_{A}^{prior}) : = min_{d e g \in [0, 1]} {(1 - p_{A}^{prior}) \cdot LO (0, d e g) + p_{A}^{prior} \cdot LO (1, d e g)},

which can be thus interpreted as a minimal prior expected loss (the minimum will always exist). The corresponding risk posterior to the GWI-path observations

X_{n}

, from the optimal decision about the degree of evidence

d e g

concerning the parameter

θ

, is given by

{BR}_{LO}^{post} (p_{A}^{prior}) : = \int_{Ω_{n}} {BR}_{LO} (p_{A}^{post} (X_{n})) (p_{A}^{prior} d P_{A, n} + (1 - p_{A}^{prior}) d P_{H, n}),

which is achieved by the optimal decision rule (about the degree of evidence)

D^{*} (X_{n}) : = arg min_{d e g \in [0, 1]} {(1 - p_{A}^{post} (X_{n})) \cdot LO (0, d e g) + p_{A}^{post} (X_{n}) \cdot LO (1, d e g)} .

The corresponding statistical information measure (in the sense of De Groot [124])

Δ {BR}_{LO} (p_{A}^{prior}) : = {BR}_{LO} (p_{A}^{prior}) - {BR}_{LO}^{post} (p_{A}^{prior}) \geq 0

represents the reduction of the decision risk about the degree of evidence

d e g

concerning the parameter

θ

, that can be attained by observing the GWI-path

X_{n}

until stage n. For the first-type loss function

\tilde{LO} (θ, d e g) : = d e g - (2 d e g - 1) \cdot 1_{{1}} (θ)

, defined on

{0, 1} \times [0, 1]

with the help of the indicator function

1_{A} (.)

on the set A, one can show that

D^{*} (X_{n}) := \begin{cases} 0, & \text{if } p_{A}^{post} (X_{n}) \in [0, \frac{1}{2} [, \\ 1, & \text{if } p_{A}^{post} (X_{n}) \in ] \frac{1}{2}, 1 [, \\ \text{any number in } [0, 1], & \text{if } p_{A}^{post} (X_{n}) = \frac{1}{2}, \end{cases}

as well as the representation formula

I_{λ} (P_{A, n} ∥ P_{H, n}) = \int_{0}^{1} Δ {BR}_{\tilde{LO}} (p_{A}^{prior}) \cdot {(1 - p_{A}^{prior})}^{λ - 2} \cdot {(p_{A}^{prior})}^{- 1 - λ} d p_{A}^{prior}, λ \in R,

(21)

(cf. Österreicher & Vajda [123], Liese & Vajda [10], adapted to our GWI context); in other words, the power divergence

I_{λ} (P_{A, n} ∥ P_{H, n})

can be regarded as a weighted-average statistical information measure (weighted-average decision risk reduction). One can also use other weights of

p_{A}^{prior}

in order to get bounds of

I_{λ} (P_{A, n} ∥ P_{H, n})

(analogously to Stummer [5]).
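To make the first-type loss more tangible: it satisfies \tilde{LO}(0, deg) = deg and \tilde{LO}(1, deg) = 1 - deg, so the expected loss is linear in deg and the minimal prior expected loss equals min{p_{A}^{prior}, 1 - p_{A}^{prior}}. The following Python sketch checks this by a plain grid search; the function names and the grid resolution are hypothetical choices made for illustration.

```python
def loss_first_type(theta, deg):
    """First-type loss: deg - (2*deg - 1) * 1_{1}(theta)."""
    return deg - (2.0 * deg - 1.0) * (1.0 if theta == 1 else 0.0)

def bayes_risk_grid(p_a, grid_size=1001):
    """Minimize the expected loss over a grid of deg values in [0, 1]."""
    best = None
    for i in range(grid_size):
        deg = i / (grid_size - 1)
        risk = (1.0 - p_a) * loss_first_type(0, deg) + p_a * loss_first_type(1, deg)
        if best is None or risk < best[0]:
            best = (risk, deg)
    return best  # (minimal expected loss, minimizing deg)

for p_a in (0.2, 0.7):
    print(p_a, bayes_risk_grid(p_a), min(p_a, 1.0 - p_a))
```

The minimizer jumps from deg = 0 to deg = 1 as the probability of (A) crosses 1/2, in line with the optimal decision rule displayed above (with the posterior in place of the prior).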

For the second-type loss function

{LO}_{λ, χ} (θ, d e g) : = \frac{λ^{θ - 1} {d e g}^{λ - θ}}{χ^{λ} {(1 - χ)}^{1 - λ} {(1 - λ)}^{θ} {(1 - d e g)}^{λ - θ}}

defined on

{0, 1} \times [0, 1]

with parameters

λ \in] 0, 1 [

and

χ \in] 0, 1 [

, one can derive the optimal decision rule

D^{*} (X_{n}) = p_{A}^{post} (X_{n})

as well as the representation formula as a limit statistical information measure (limit decision risk reduction)

I_{λ} (P_{A, n} ∥ P_{H, n}) = lim_{χ \to p_{A}^{prior}} Δ {BR}_{{LO}_{λ, χ}} (p_{A}^{prior}) = : Δ {BR}_{{LO}_{λ, p_{A}^{prior}}} (p_{A}^{prior})

(22)

(cf. Österreicher & Vajda [123], Stummer [5], adapted to our GWI context).

As an alternative to the above-mentioned Bayesian-decision-making applications of Hellinger integrals

H_{λ} (P_{A, n} ∥ P_{H, n})

, let us now briefly discuss the use of the latter for the corresponding Neyman-Pearson (NPT) framework with randomized tests

T_{n} : Ω_{n} \mapsto [0, 1]

of the hypothesis

P_{H}

against the alternative

P_{A}

, based on the GWI-generation-size sample path observations

X_{n} : = {X_{l} : l \in {0, 1, \dots, n}}

. In contrast to (17) and (18), a Neyman-Pearson test minimizes (over all

T_{n}

) the type II error probability

\int_{Ω_{n}} (1 - T_{n}) d P_{A, n}

in the class of the tests for which the type I error probability

\int_{Ω_{n}} T_{n} d P_{H, n}

is at most

ς \in] 0, 1 [

. The corresponding minimal type II error probability

E_{ς} (P_{A, i} ∥ P_{H, i}) : = inf_{T_{i} : \int_{Ω_{i}} T_{i} d P_{H, i} \leq ς} \int_{Ω_{i}} (1 - T_{i}) d P_{A, i}

can for all

ς \in] 0, 1 [

,

λ \in] 0, 1 [

,

i \in I

be bounded from above by

E_{ς} (P_{A, i} ∥ P_{H, i}) \leq E_{ς}^{U} (P_{A, i} ∥ P_{H, i}) : = min \{(1 - λ) \cdot {(\frac{λ}{ς})}^{λ / (1 - λ)} \cdot {(H_{λ} (P_{A, i} ∥ P_{H, i}))}^{1 / (1 - λ)}, 1\},

(23)

and for all

λ > 1

,

i \in I

it can be bounded from below by

E_{ς} (P_{A, i} ∥ P_{H, i}) \geq E_{ς}^{L} (P_{A, i} ∥ P_{H, i}) : = {(1 - ς)}^{λ / (λ - 1)} \cdot {(H_{λ} (P_{A, i} ∥ P_{H, i}))}^{1 / (1 - λ)},

(24)

which is an adaptation of a general result of Krafft & Plachky [125], see also Liese & Vajda [1] as well as Stummer & Vajda [15]. Hence, by combining (23) and (24) with the exact values respectively upper bounds of the Hellinger integrals

H_{1 - λ} (P_{A, n} ∥ P_{H, n})

from the following sections, we obtain for our context of Galton-Watson processes with Poisson offspring and Poisson immigration (including the non-immigration case) some upper bounds of

E_{ς} (P_{A, n} ∥ P_{H, n})

, which can also be immediately rewritten as lower bounds for the power

1 - E_{ς} (P_{A, n} ∥ P_{H, n})

of a most powerful test at level

ς

. In contrast to such finite-time-horizon results, for the (to our context) incompatible setup of Galton-Watson processes with Poisson offspring but nonstochastic immigration of constant value 1, the asymptotic rates of decrease as

n \to \infty

of the unconstrained type II error probabilities as well as the type I error probabilities were studied in Linkov & Lunyova [53] by a different approach employing also Hellinger integrals. Some other types of Neyman-Pearson testing investigations concerning Galton-Watson processes, different from ours, can be found e.g., in Basawa & Scott [126], Feigin [127], Sweeting [128], Basawa & Scott [61], and the references therein.
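As a small numerical companion to the bounds (23) and (24), the following Python sketch evaluates both formulas for a given value of the Hellinger integral. It is only an illustration: the function names are hypothetical, and the Hellinger integral values passed in the usage lines are arbitrary placeholders rather than values computed for a concrete GWI pair.

```python
def type2_upper_bound(hellinger, lam, level):
    """Upper bound (23); requires 0 < lam < 1 and 0 < level < 1."""
    if not (0.0 < lam < 1.0 and 0.0 < level < 1.0):
        raise ValueError("need lam in ]0,1[ and level in ]0,1[")
    value = ((1.0 - lam)
             * (lam / level) ** (lam / (1.0 - lam))
             * hellinger ** (1.0 / (1.0 - lam)))
    return min(value, 1.0)

def type2_lower_bound(hellinger, lam, level):
    """Lower bound (24); requires lam > 1 and 0 < level < 1."""
    if not (lam > 1.0 and 0.0 < level < 1.0):
        raise ValueError("need lam > 1 and level in ]0,1[")
    return (1.0 - level) ** (lam / (lam - 1.0)) * hellinger ** (1.0 / (1.0 - lam))

# Placeholder Hellinger integral values, chosen for illustration only.
print(type2_upper_bound(hellinger=0.1, lam=0.5, level=0.05))
print(type2_lower_bound(hellinger=4.0, lam=2.0, level=0.05))
```

Since Hellinger integrals of order λ in ]0, 1[ are at most 1 and those of order λ > 1 are at least 1, small values in the first case and large values in the second case both push the respective bound towards 0.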

2.6. Asymptotical Distinguishability

The next two concepts deal with two general families

{(P_{A, i})}_{i \in I}

and

{(P_{H, i})}_{i \in I}

of probability measures on the measurable spaces

{(Ω_{i}, F_{i})}_{i \in I}

, where the index set

I

is either

N_{0}

or

R_{+}

. For them, the following two general types of asymptotical distinguishability are well known (see e.g., LeCam [109], Liese & Vajda [1], Jacod & Shiryaev [24], Linkov [129], and the references therein).

Definition 1.

The family

{(P_{A, i})}_{i \in I}

is contiguous to the family

{(P_{H, i})}_{i \in I}

– in symbols,

(P_{A, i}) ◃ (P_{H, i})

– if for all sets

A_{i} \in F_{i}

with

{lim}_{i \to \infty} P_{H, i} (A_{i}) = 0

there holds

{lim}_{i \to \infty} P_{A, i} (A_{i}) = 0

.

Definition 2.

Families of measures

{(P_{A, i})}_{i \in I}

and

{(P_{H, i})}_{i \in I}

are called entirely separated (completely asymptotically distinguishable), in symbols

(P_{A, i}) △ (P_{H, i})

, if there exists a sequence

i_{m} ↑ \infty

as

m ↑ \infty

and for each

m \in N_{0}

an

A_{i_{m}} \in F_{i_{m}}

such that

{lim}_{m \to \infty} P_{A, i_{m}} (A_{i_{m}}) = 1

and

{lim}_{m \to \infty} P_{H, i_{m}} (A_{i_{m}}) = 0

.

It is clear that the notion of contiguity is the attempt to carry the concept of absolute continuity over to families of measures. Loosely speaking,

(P_{A, i})

is contiguous to

(P_{H, i})

, if the limit

{lim}_{i \to \infty} (P_{A, i})

(existence preconditioned) is absolutely continuous with respect to the limit

{lim}_{i \to \infty} (P_{H, i})

. However, for the definition of contiguity, we do not need to require the probability measures to converge to limiting probability measures. On the other hand, entire separation is the generalization of singularity to families of measures.

The corresponding negations will be denoted by

\bar{◃}

and

\bar{△}

. One can easily check that a family

(P_{A, i})

cannot be both contiguous and entirely separated to a family

(P_{H, i})

. In fact, as shown in Linkov [129], the relation between the families

(P_{A, i})

and

(P_{H, i})

can be uniquely classified into the following distinguishability types:

(a): $(P_{A, i}) ◃ ▹ (P_{H, i})$ ;
(b): $(P_{A, i}) ◃ (P_{H, i})$ , $(P_{H, i}) \bar{◃} (P_{A, i})$ ;
(c): $(P_{A, i}) \bar{◃} (P_{H, i})$ , $(P_{H, i}) ◃ (P_{A, i})$ ;
(d): $(P_{A, i}) \bar{◃} \bar{▹} (P_{H, i})$ , $(P_{A, i}) \bar{△} (P_{H, i})$ ;
(e): $(P_{A, i}) △ (P_{H, i})$ .

As demonstrated in the above-mentioned references for a general context, one can conclude the type of distinguishability from the time-evolution of Hellinger integrals. Indeed, the following assertions can be found e.g., in Linkov [129], where part (c) was established in Liese & Vajda [1] and (f), (g) in Vajda [3].

Proposition 1.

The following assertions are equivalent:

(a): $(P_{A, i}) △ (P_{H, i})$ ,
(b): $\liminf_{i \to \infty} H_{λ} (P_{A, i} ∥ P_{H, i}) = 0$ for all $λ \in ] 0, 1 [$ ,
(c): there exists a $λ \in ] 0, 1 [$ : $\liminf_{i \to \infty} H_{λ} (P_{A, i} ∥ P_{H, i}) = 0$ ,
(d): there exists a $π \in ] 0, 1 [$ : $\liminf_{i \to \infty} e_{π} (P_{A, i} ∥ P_{H, i}) = 0$ ,
(e): $\limsup_{i \to \infty} V (P_{A, i} ∥ P_{H, i}) = 2$ ,
(f): there exists a $λ \in ] 0, 1 [$ : $\limsup_{i \to \infty} I_{λ} (P_{A, i} ∥ P_{H, i}) = \frac{1}{λ \cdot (1 - λ)}$ ,
(g): $\limsup_{i \to \infty} I_{λ} (P_{A, i} ∥ P_{H, i}) = \frac{1}{λ \cdot (1 - λ)}$ for all $λ \in ] 0, 1 [$ .

(25)

In combination with the discussion after Definition 2, one can thus interpret the

λ

-order Hellinger integral

H_{λ} (P_{A, i} ∥ P_{H, i})

as a “measure” for the distinctness of the two families

P_{A, i}

and

P_{H, i}

up to a fixed finite time horizon

i \in I

.

Furthermore, for the contiguity we obtain the equivalence (see e.g., Liese & Vajda [1], Linkov [129])

\begin{aligned} (P_{A, i}) ◃ (P_{H, i}) \; & ⟺ \; \liminf_{λ ↗ 1} \{\liminf_{i \to \infty} H_{λ} (P_{A, i} ∥ P_{H, i})\} = 1 \\ & ⟺ \; \limsup_{λ ↗ 1} \{\limsup_{i \to \infty} λ \cdot (1 - λ) \cdot I_{λ} (P_{A, i} ∥ P_{H, i})\} = 0 . \end{aligned}

(26)

All the above-mentioned general results can be applied to our context of two competing Poissonian Galton-Watson processes with immigration (GWI)

(H)

and

(A)

(reflected by the two different laws

P_{H}

resp.

P_{A}

with parameter pairs

(β_{H}, α_{H})

resp.

(β_{A}, α_{A})

), by taking

P_{A, i} : = {P_{A}|}_{F_{i}}

and

P_{H, i} : = {P_{H}|}_{F_{i}}

. Recall from the preceding subsections (by identifying i with n) that the latter two describe the stochastic dynamics of the respective GWI within the restricted time-/stage-frame

{0, 1, \dots, i}

.

In the following, we study in detail the evolution of Hellinger integrals between two competing models of Galton-Watson processes with immigration, which turns out to be quite extensive.

3. Detailed Recursive Analyses of Hellinger Integrals

3.1. A First Basic Result

In terms of our notations (PS1) to (PS3), a typical situation for applications in our mind is that one particular constellation

(β_{A}, β_{H}, α_{A}, α_{H}) \in P

(e.g., obtained from theoretical or previous statistical investigations) is fixed, whereas–in contrast–the parameter

λ \in R \ {0, 1}

for the Hellinger integral or the power divergence might be chosen freely, e.g., depending on which (transform of a) dissimilarity measure one decides to choose for further analysis. At this point, let us emphasize that in general we will not make assumptions of the form

β_{•} ⪌ 1

, i.e., upon the type of criticality.

To start with our investigations, in order to justify for all

n \in N_{0}

Z_{n} := \frac{d P_{A, n}}{d P_{H, n}} (cf. (13)),

(14) and (15) (as well as

I_{λ} (P_{A, n} ∥ P_{H, n})

for

λ \in R

respectively

R_{λ} (P_{A, n} ∥ P_{H, n})

for

λ \in R \ {0, 1}

), we first mention the following straightforward facts: (i) if

(β_{A}, β_{H}, α_{A}, α_{H}) \in P_{NI}

, then

P_{A, n}

and

P_{H, n}

are equivalent (i.e.,

P_{A, n} \sim P_{H, n}

), as well as (ii) if

(β_{A}, β_{H}, α_{A}, α_{H}) \in P_{SP}

, then

P_{A, n}

and

P_{H, n}

are equivalent (i.e.,

P_{A, n} \sim P_{H, n}

). Moreover, by recalling

Z_{0} = 1

and using the “rate functions”

f_{•} (x) = β_{•} x + α_{•}

(

x \in [0, \infty [

), a version of (13) can be easily determined by calculating for each

\vec{x} : = (x_{0}, x_{1}, x_{2}, \dots) \in Ω : = N \times N_{0} \times N_{0} \times \dots

Z_{n} (\vec{x}) = \prod_{k = 1}^{n} Z_{n, k} (\vec{x}) \quad \text{with} \quad Z_{n, k} (\vec{x}) := \exp \{- (f_{A} (x_{k - 1}) - f_{H} (x_{k - 1}))\} \cdot {[\frac{f_{A} (x_{k - 1})}{f_{H} (x_{k - 1})}]}^{x_{k}},

where for the last term we use the convention

{(\frac{0}{0})}^{x} = 1

for all

x \in N_{0}

. Furthermore, we define for each

\vec{x} \in Ω

Z_{n, k}^{(λ)} (\vec{x}) : = exp \{- (λ f_{A} (x_{k - 1}) + (1 - λ) f_{H} (x_{k - 1}))\} \frac{{[{(f_{A} (x_{k - 1}))}^{λ} {(f_{H} (x_{k - 1}))}^{1 - λ}]}^{x_{k}}}{x_{k}!}

(27)

with the convention

\frac{{(0)}^{0}}{0!} = 1

for the last term. Accordingly, one obtains from (14) the Hellinger integral

H_{λ} (P_{A, 0} ∥ P_{H, 0}) = 1

, as well as for all

(β_{A}, β_{H}, α_{A}, α_{H}, λ) \in P \times (R \ {0, 1})

H_{λ} (P_{A, 1} ∥ P_{H, 1}) = exp \{{(f_{A} (x_{0}))}^{λ} {(f_{H} (x_{0}))}^{(1 - λ)} - (λ f_{A} (x_{0}) + (1 - λ) f_{H} (x_{0}))\}

(28)

for

x_{0} = X_{0} \in N

, and for all

n \in N \ {1}

\begin{aligned} H_{λ} (P_{A, n} ∥ P_{H, n}) & = E_{P_{H, n}} [{(Z_{n})}^{λ}] = \sum_{x_{1} = 0}^{\infty} \dots \sum_{x_{n} = 0}^{\infty} \prod_{k = 1}^{n} Z_{n, k}^{(λ)} (\vec{x}) \\ & = \sum_{x_{1} = 0}^{\infty} \dots \sum_{x_{n - 1} = 0}^{\infty} \prod_{k = 1}^{n - 1} Z_{n, k}^{(λ)} (\vec{x}) \cdot e^{- (λ f_{A} (x_{n - 1}) + (1 - λ) f_{H} (x_{n - 1}))} \sum_{x_{n} = 0}^{\infty} \frac{{[{(f_{A} (x_{n - 1}))}^{λ} {(f_{H} (x_{n - 1}))}^{1 - λ}]}^{x_{n}}}{x_{n}!} \\ & = \sum_{x_{1} = 0}^{\infty} \dots \sum_{x_{n - 1} = 0}^{\infty} \prod_{k = 1}^{n - 1} Z_{n, k}^{(λ)} (\vec{x}) \cdot \exp \{{(f_{A} (x_{n - 1}))}^{λ} {(f_{H} (x_{n - 1}))}^{1 - λ} - (λ f_{A} (x_{n - 1}) + (1 - λ) f_{H} (x_{n - 1}))\} . \end{aligned}

(29)

From (29), one can see that a crucial role for the exact calculation (respectively the derivation of bounds) of the Hellinger integral is played by the functions defined for

x \in [0, \infty [

ϕ_{λ} (x) : = ϕ (x, β_{A}, β_{H}, α_{A}, α_{H}, λ) : = φ_{λ} (x) - f_{λ} (x), with

(30)

φ_{λ} (x) : = φ (x, β_{A}, β_{H}, α_{A}, α_{H}, λ) : = {(f_{A} (x))}^{λ} {(f_{H} (x))}^{1 - λ} and

(31)

f_{λ} (x) : = f (x, β_{A}, β_{H}, α_{A}, α_{H}, λ) : = λ f_{A} (x) + (1 - λ) f_{H} (x) = α_{λ} + β_{λ} x,

(32)

where we have used the

λ

-weighted-averages

α_{λ} : = α (α_{A}, α_{H}, λ) : = λ \cdot α_{A} + (1 - λ) \cdot α_{H} and β_{λ} : = β (β_{A}, β_{H}, λ) : = λ \cdot β_{A} + (1 - λ) \cdot β_{H} .

Since

λ

plays a special role, henceforth we typically use it as index and often omit

(β_{A}, β_{H}, α_{A}, α_{H})

. According to Lemma A1 in the Appendix A.1, it follows that for

λ \in] 0, 1 [

(respectively

λ \in R \ [0, 1]

) one gets

ϕ_{λ} (x) \leq 0

(respectively

ϕ_{λ} (x) \geq 0

) for all

x \in [0, \infty [

. Furthermore, in both cases there holds

ϕ_{λ} (x) = 0

iff

f_{A} (x) = f_{H} (x)

, i.e., for

x = x^{*} : = \frac{α_{A} - α_{H}}{β_{H} - β_{A}} \geq 0

. This is consistent with the corresponding generally valid upper and lower bounds (cf. (9) and (11))

0 < H_{λ} (P_{A, n} ∥ P_{H, n}) \leq 1, for λ \in] 0, 1 [, 1 \leq H_{λ} (P_{A, n} ∥ P_{H, n}) \leq \infty, for λ \in R \ [0, 1]

.
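The sign behaviour of ϕ_{λ} stated above can be checked numerically. The following Python sketch evaluates ϕ_{λ}(x) = (f_{A}(x))^{λ} (f_{H}(x))^{1 - λ} - (λ f_{A}(x) + (1 - λ) f_{H}(x)) on a range of integers; the function name and the parameter values (chosen such that x^{*} = 2) are hypothetical, and strictly positive importation means are assumed so that negative powers are well defined.

```python
def phi(x, lam, beta_a, beta_h, alpha_a, alpha_h):
    """phi_lambda(x) = f_A(x)^lam * f_H(x)^(1-lam) - (lam*f_A(x) + (1-lam)*f_H(x))."""
    f_a = beta_a * x + alpha_a
    f_h = beta_h * x + alpha_h
    return f_a ** lam * f_h ** (1.0 - lam) - (lam * f_a + (1.0 - lam) * f_h)

# Illustrative constellation with x* = (alpha_A - alpha_H) / (beta_H - beta_A) = 2.
pars = dict(beta_a=1.0, beta_h=2.0, alpha_a=3.0, alpha_h=1.0)
for lam in (0.3, 1.7, -0.5):
    values = [phi(x, lam, **pars) for x in range(6)]
    print(lam, [round(v, 4) for v in values])
```

For λ = 0.3 all printed values are nonpositive, for λ = 1.7 and λ = -0.5 they are nonnegative, and the value at x = 2 vanishes up to rounding error.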

As a first indication for our proposed method, let us start by illuminating the simplest case

λ \in R \ {0, 1}

and

γ : = α_{H} β_{A} - α_{A} β_{H} = 0

. This means that

(β_{A}, β_{H}, α_{A}, α_{H}) \in P_{NI} \cup P_{SP, 1}

, where

P_{SP, 1}

is the set of all (componentwise) strictly positive

(β_{A}, β_{H}, α_{A}, α_{H})

with

β_{A} \neq β_{H}

,

α_{A} \neq α_{H}

and

\frac{β_{A}}{β_{H}} = \frac{α_{A}}{α_{H}} \neq 1

(“the equal-fraction-case”). In this situation, all the three functions (30) to (32) are linear. Indeed,

φ_{λ} (x) = p_{λ}^{E} + q_{λ}^{E} x

(33)

with

p_{λ}^{E} : = α_{A}^{λ} α_{H}^{1 - λ}

and

q_{λ}^{E} : = β_{A}^{λ} β_{H}^{1 - λ}

(where the index E stands for exact linearity). Clearly,

q_{λ}^{E} > 0

on

P_{NI} \cup P_{SP, 1}

, as well as

p_{λ}^{E} > 0

on

P_{SP, 1}

and

p_{λ}^{E} = 0

on

P_{NI}

. Furthermore,

ϕ_{λ} (x) = r_{λ}^{E} + s_{λ}^{E} x

with

r_{λ}^{E} : = p_{λ}^{E} - α_{λ} = α_{A}^{λ} α_{H}^{1 - λ} - (λ α_{A} + (1 - λ) α_{H})

and

s_{λ}^{E} : = q_{λ}^{E} - β_{λ} = β_{A}^{λ} β_{H}^{1 - λ} - (λ β_{A} + (1 - λ) β_{H})

. Due to Lemma A1 one knows that on

P_{NI} \cup P_{SP, 1}

one gets

s_{λ}^{E} < 0

for

λ \in] 0, 1 [

and

s_{λ}^{E} > 0

for

λ \in R \ [0, 1]

. Furthermore, on

P_{SP, 1}

one gets

r_{λ}^{E} < 0

(resp.

r_{λ}^{E} > 0

) for

λ \in] 0, 1 [

(resp.

λ \in R \ [0, 1]

), whereas on

P_{NI}

, the no-immigration setup, we get for all

λ \in R \ {0, 1}

r_{λ}^{E} = 0

.

As will be seen later on, such linearity properties are useful for the recursive handling of the Hellinger integrals. However, only on the parameter set

P_{NI} \cup P_{SP, 1}

the functions

φ_{λ}

and

ϕ_{λ}

are linear. Hence, in the general case

(β_{A}, β_{H}, α_{A}, α_{H}, λ) \in P \times (R \ {0, 1})

we aim for linear lower and upper bounds

φ_{λ}^{L} (x) : = p_{λ}^{L} + q_{λ}^{L} x \leq φ_{λ} (x) \leq φ_{λ}^{U} (x) : = p_{λ}^{U} + q_{λ}^{U} x,

(34)

x \in [0, \infty [

(ultimately,

x \in N_{0}

), which by (30) and (31) leads to

ϕ_{λ} (x) \begin{cases} \leq ϕ_{λ}^{U} (x) := r_{λ}^{U} + s_{λ}^{U} \cdot x := (p_{λ}^{U} - α_{λ}) + (q_{λ}^{U} - β_{λ}) \cdot x, \\ \geq ϕ_{λ}^{L} (x) := r_{λ}^{L} + s_{λ}^{L} \cdot x := (p_{λ}^{L} - α_{λ}) + (q_{λ}^{L} - β_{λ}) \cdot x, \end{cases}

(35)

x \in [0, \infty [

(ultimately,

x \in N_{0}

). Of course, the involved slopes and intercepts should satisfy reasonable restrictions. Later on, we shall impose further restrictions on the involved slopes and intercepts, in order to guarantee nice properties of the general Hellinger integral bounds given in Theorem 1 below (for instance, in consistency with the nonnegativity of

φ_{λ}

we could require

p_{λ}^{U} \geq p_{λ}^{L} \geq 0

,

q_{λ}^{U} \geq q_{λ}^{L} \geq 0

which nontrivially implies that these bounds possess certain monotonicity properties). For the formulation of our first assertions on Hellinger integrals, we make use of the following notation:

Definition 3.

For all

(β_{A}, β_{H}, α_{A}, α_{H}, λ) \in P \times (R \ {0, 1})

and all

p, q \in R

let us define the sequences

{(a_{n}^{(q)})}_{n \in N_{0}}

and

{(b_{n}^{(p, q)})}_{n \in N_{0}}

recursively by

a_{0}^{(q)} := 0 ; \qquad a_{n}^{(q)} := ξ_{λ}^{(q)} (a_{n - 1}^{(q)}) := q \cdot e^{a_{n - 1}^{(q)}} - β_{λ}, \quad n \in N,

(36)

b_{0}^{(p, q)} := 0 ; \qquad b_{n}^{(p, q)} := p \cdot e^{a_{n - 1}^{(q)}} - α_{λ}, \quad n \in N .

(37)

Notice the interrelation

a_{1}^{(q_{λ}^{A})} = s_{λ}^{A}

and

b_{1}^{(p_{λ}^{A}, q_{λ}^{A})} = r_{λ}^{A}

for

A \in {E, L, U}

. Clearly, for all

q \in R \ {0}

and

p \in R

one has the linear interrelation

b_{n}^{(p, q)} = \frac{p}{q} a_{n}^{(q)} + \frac{p}{q} β_{λ} - α_{λ}, n \in N .

(38)
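The recursions (36) and (37) are straightforward to implement. The following Python sketch (with a hypothetical function name and illustrative input values) generates both sequences and numerically confirms the linear interrelation (38); it also shows the special case q = β_{λ}, in which the first sequence stays at zero.

```python
import math

def sequences(p, q, alpha_lam, beta_lam, n_max):
    """Return ([a_0, ..., a_n_max], [b_0, ..., b_n_max]) from (36) and (37)."""
    a, b = [0.0], [0.0]
    for _ in range(n_max):
        a_prev = a[-1]
        a.append(q * math.exp(a_prev) - beta_lam)    # a_n = q * e^{a_{n-1}} - beta_lambda
        b.append(p * math.exp(a_prev) - alpha_lam)   # b_n = p * e^{a_{n-1}} - alpha_lambda
    return a, b

# Illustrative values: lambda = 1/2, (beta_A, beta_H) = (1.2, 0.6), (alpha_A, alpha_H) = (2, 1).
p, q = math.sqrt(2.0), math.sqrt(0.72)
a, b = sequences(p, q, alpha_lam=1.5, beta_lam=0.9, n_max=5)
for n in range(1, 6):
    print(n, a[n], b[n], (p / q) * a[n] + (p / q) * 0.9 - 1.5)
```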

Accordingly, we obtain fundamental Hellinger integral evaluations:

Theorem 1.

(a): For all $(β_{A}, β_{H}, α_{A}, α_{H}, λ) \in (P_{NI} \cup P_{SP, 1}) \times (R \ {0, 1})$ , all initial population sizes $X_{0} \in N$ and all observation horizons $n \in N$ one can recursively compute the exact value

$H_{λ} (P_{A, n} ∥ P_{H, n}) = exp \{a_{n}^{(q_{λ}^{E})} X_{0} + \frac{α_{A}}{β_{A}} \sum_{k = 1}^{n} a_{k}^{(q_{λ}^{E})}\} = : V_{λ, X_{0}, n},$

(39)

where $\frac{α_{A}}{β_{A}}$ can be equivalently replaced by $\frac{α_{H}}{β_{H}}$ . Recall that $q_{λ}^{E} : = β_{A}^{λ} β_{H}^{1 - λ}$ . Notice that on $P_{NI} \times (R \ {0, 1})$ the formula (39) simplifies significantly, since $α_{A} = α_{H} = 0$ .
(b): For all $(β_{A}, β_{H}, α_{A}, α_{H}, λ) \in (P_{SP} \ P_{SP, 1}) \times (R \ {0, 1})$ , all coefficients $p_{λ}^{L}, p_{λ}^{U}, q_{λ}^{L}, q_{λ}^{U} \in R$ which satisfy (35) for all $x \in N_{0}$ (and thus in particular $p_{λ}^{L} \leq p_{λ}^{U}, q_{λ}^{L} \leq q_{λ}^{U}$ ), all initial population sizes $X_{0} \in N$ and all observation horizons $n \in N$ one gets the following recursive (i.e., recursively computable) bounds for the Hellinger integrals:

$for λ \in] 0, 1 [: B_{λ, X_{0}, n}^{L} : = {\tilde{B}}_{λ, X_{0}, n}^{(p_{λ}^{L}, q_{λ}^{L})} < H_{λ} (P_{A, n} ∥ P_{H, n}) \leq min \{{\tilde{B}}_{λ, X_{0}, n}^{(p_{λ}^{U}, q_{λ}^{U})}, 1\} = : B_{λ, X_{0}, n}^{U},$

(40)

$for λ \in R \ [0, 1] : B_{λ, X_{0}, n}^{L} : = max \{{\tilde{B}}_{λ, X_{0}, n}^{(p_{λ}^{L}, q_{λ}^{L})}, 1\} \leq H_{λ} (P_{A, n} ∥ P_{H, n}) < {\tilde{B}}_{λ, X_{0}, n}^{(p_{λ}^{U}, q_{λ}^{U})} = : B_{λ, X_{0}, n}^{U},$

(41)

where for general $λ \in R \ {0, 1}$ , $p \in R, q \in R \ {0}$ we use the definitions

${\tilde{B}}_{λ, X_{0}, n}^{(p, q)} : = exp \{a_{n}^{(q)} \cdot X_{0} + \sum_{k = 1}^{n} b_{k}^{(p, q)}\} = exp \{a_{n}^{(q)} \cdot X_{0} + \frac{p}{q} \sum_{k = 1}^{n} a_{k}^{(q)} + n \cdot (\frac{p}{q} β_{λ} - α_{λ})\},$

(42)

as well as

${\tilde{B}}_{λ, X_{0}, n}^{(p, 0)} : = exp \{- β_{λ} \cdot X_{0} + (p \cdot e^{- β_{λ}} - α_{λ}) \cdot n\} .$

Remark 1.

(a): Notice that the expression ${\tilde{B}}_{λ, X_{0}, n}^{(p, q)}$ can analogously be defined on the parameter set $P_{NI} \cup P_{SP, 1}$ . For the choices $q_{λ}^{E} : = β_{A}^{λ} β_{H}^{1 - λ} > 0$ and $p_{λ}^{E} : = α_{A}^{λ} α_{H}^{1 - λ} = q_{λ}^{E} \cdot \frac{α_{A}}{β_{A}} = q_{λ}^{E} \cdot \frac{α_{H}}{β_{H}} \geq 0$ one gets $(p_{λ}^{E} / q_{λ}^{E}) \cdot β_{λ} - α_{λ} = 0$ , and thus the characterization ${\tilde{B}}_{λ, X_{0}, n}^{(p_{λ}^{E}, q_{λ}^{E})} = V_{λ, X_{0}, n}$ as the exact value (rather than a lower/upper bound (component)).
(b): In the case $q = β_{λ}$ one gets the explicit representation ${\tilde{B}}_{λ, X_{0}, n}^{(p, q)} = exp \{(p - α_{λ}) \cdot n\}$ .
(c): Using the skew symmetry (8), one can derive alternative bounds of the Hellinger integral by switching to the transformed parameter setup $(\overset{\leftrightarrow}{β_{A}}, \overset{\leftrightarrow}{β_{H}}, \overset{\leftrightarrow}{α_{A}}, \overset{\leftrightarrow}{α_{H}}, \overset{\leftrightarrow}{λ}) : = (β_{H}, β_{A}, α_{H}, α_{A}, 1 - λ)$ . However, this does not lead to different bounds: define ${\overset{\leftrightarrow}{ϕ}}_{\overset{\leftrightarrow}{λ}}$ , ${\overset{\leftrightarrow}{φ}}_{\overset{\leftrightarrow}{λ}}$ and ${\overset{\leftrightarrow}{f}}_{\overset{\leftrightarrow}{λ}}$ analogously to (30), (31) and (32) by replacing the parameters $(β_{A}, β_{H}, α_{A}, α_{H}, λ)$ with $(\overset{\leftrightarrow}{β_{A}}, \overset{\leftrightarrow}{β_{H}}, \overset{\leftrightarrow}{α_{A}}, \overset{\leftrightarrow}{α_{H}}, \overset{\leftrightarrow}{λ})$ . Then, there holds ${\overset{\leftrightarrow}{f}}_{\overset{\leftrightarrow}{λ}} (x) = f_{λ} (x), {\overset{\leftrightarrow}{φ}}_{\overset{\leftrightarrow}{λ}} (x) = φ_{λ} (x)$ and ${\overset{\leftrightarrow}{ϕ}}_{\overset{\leftrightarrow}{λ}} (x) = ϕ_{λ} (x)$ , and the set of (lower and upper bound) parameters $p_{λ}^{L}, q_{λ}^{L}, p_{λ}^{U}, q_{λ}^{U}$ satisfying (35) does not change under this transformation.
(d): If there are no other restrictions on $p_{λ}^{L}, p_{λ}^{U}, q_{λ}^{L}, q_{λ}^{U}$ than (35), the bounds in (40) and (41) can have some inconvenient features, e.g., being 1 for all (large enough) $n \in N$ , having oscillating n-behaviour, being suboptimal in certain (other) senses. For a detailed discussion, the reader is referred to Section 3.16 ff. below.
(e): For the (to our context) incompatible setup of GWI with Poisson offspring but nonstochastic immigration of constant value 1, the exact values of the corresponding Hellinger integrals (i.e., an “analogue” of part (a)) were established in Linkov & Lunyova [53].
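As a plausibility check of Theorem 1(a), the closed-form value (39) can be compared with a direct (truncated) evaluation of the sum (29) for n = 2. The Python sketch below does this for one equal-fraction constellation with λ = 1/2. It is merely an illustration: the function names, the parameter values and the truncation point are hypothetical choices, and the brute-force routine assumes λ in ]0, 1[ and strictly positive rates.

```python
import math

def hellinger_exact(lam, x0, n, beta_a, beta_h, alpha_a, alpha_h):
    """Value (39); assumes alpha_H * beta_A == alpha_A * beta_H and beta_A, beta_H > 0."""
    q = beta_a ** lam * beta_h ** (1.0 - lam)
    beta_lam = lam * beta_a + (1.0 - lam) * beta_h
    a, a_sum = 0.0, 0.0
    for _ in range(n):
        a = q * math.exp(a) - beta_lam   # recursion (36) with q = q_lambda^E
        a_sum += a
    return math.exp(a * x0 + (alpha_a / beta_a) * a_sum)

def hellinger_two_steps_brute_force(lam, x0, beta_a, beta_h, alpha_a, alpha_h, cutoff=200):
    """Truncated evaluation of (29) for n = 2 (only the sum over x_1 remains)."""
    def f(x, beta, alpha):
        return beta * x + alpha
    def varphi(x):
        return f(x, beta_a, alpha_a) ** lam * f(x, beta_h, alpha_h) ** (1.0 - lam)
    def f_lam(x):
        return lam * f(x, beta_a, alpha_a) + (1.0 - lam) * f(x, beta_h, alpha_h)
    total = 0.0
    for x1 in range(cutoff):
        # log of Z_{2,1}^{(lambda)} as in (27), with x_0 fixed and x_1 running
        log_weight = -f_lam(x0) + x1 * math.log(varphi(x0)) - math.lgamma(x1 + 1)
        total += math.exp(log_weight + varphi(x1) - f_lam(x1))
    return total

pars = dict(beta_a=1.2, beta_h=0.6, alpha_a=2.0, alpha_h=1.0)  # beta_A/beta_H = alpha_A/alpha_H = 2
print(hellinger_exact(0.5, 3, 2, **pars))
print(hellinger_two_steps_brute_force(0.5, 3, **pars))
```

Both printed numbers should agree up to rounding and truncation error, since the truncated terms are negligible for the moderate rates used here.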

Proof of Theorem 1.

Let us fix

(β_{A}, β_{H}, α_{A}, α_{H}) \in P

as well as

x_{0} : = X_{0} \in N

, and start with arbitrary

λ \in] 0, 1 [

. We first prove the upper bound

B_{λ, X_{0}, n}^{U}

of part (b). Correspondingly, we suppose that the coefficients

p_{λ}^{U}

,

q_{λ}^{U}

satisfy (35) for all

x \in N_{0}

. From (28), (30), (31), (32) and (35) one gets immediately

B_{λ, X_{0}, 1}^{U}

in terms of the first sequence-element

a_{1}^{(q_{λ}^{U})}

(cf. (36)). With the help of (29) for all observation horizons

n \in N \ {1}

we get (with the obvious shortcut for

n = 2

)

\begin{aligned} H_{λ} (P_{A, n} ∥ P_{H, n}) & = \sum_{x_{1} = 0}^{\infty} \dots \sum_{x_{n - 1} = 0}^{\infty} \prod_{k = 1}^{n - 1} Z_{n, k}^{(λ)} (\vec{x}) \cdot \exp \{φ_{λ} (x_{n - 1}) - f_{λ} (x_{n - 1})\} \\ & < \sum_{x_{1} = 0}^{\infty} \dots \sum_{x_{n - 1} = 0}^{\infty} \prod_{k = 1}^{n - 1} Z_{n, k}^{(λ)} (\vec{x}) \cdot \exp \{(p_{λ}^{U} - α_{λ}) + (q_{λ}^{U} - β_{λ}) x_{n - 1}\} \\ & = \sum_{x_{1} = 0}^{\infty} \dots \sum_{x_{n - 1} = 0}^{\infty} \prod_{k = 1}^{n - 1} Z_{n, k}^{(λ)} (\vec{x}) \cdot \exp \{b_{1}^{(p_{λ}^{U}, q_{λ}^{U})} + a_{1}^{(q_{λ}^{U})} x_{n - 1}\} \\ & = \exp \{b_{1}^{(p_{λ}^{U}, q_{λ}^{U})}\} \sum_{x_{1} = 0}^{\infty} \dots \sum_{x_{n - 2} = 0}^{\infty} \prod_{k = 1}^{n - 2} Z_{n, k}^{(λ)} (\vec{x}) \cdot \exp \{\exp \{a_{1}^{(q_{λ}^{U})}\} φ_{λ} (x_{n - 2}) - f_{λ} (x_{n - 2})\} \\ & < \exp \{b_{1}^{(p_{λ}^{U}, q_{λ}^{U})}\} \sum_{x_{1} = 0}^{\infty} \dots \sum_{x_{n - 2} = 0}^{\infty} \prod_{k = 1}^{n - 2} Z_{n, k}^{(λ)} (\vec{x}) \cdot \exp \{(\exp \{a_{1}^{(q_{λ}^{U})}\} p_{λ}^{U} - α_{λ}) + (\exp \{a_{1}^{(q_{λ}^{U})}\} q_{λ}^{U} - β_{λ}) \cdot x_{n - 2}\} \\ & = \exp \{b_{1}^{(p_{λ}^{U}, q_{λ}^{U})}\} \sum_{x_{1} = 0}^{\infty} \dots \sum_{x_{n - 2} = 0}^{\infty} \prod_{k = 1}^{n - 2} Z_{n, k}^{(λ)} (\vec{x}) \cdot \exp \{b_{2}^{(p_{λ}^{U}, q_{λ}^{U})} + a_{2}^{(q_{λ}^{U})} x_{n - 2}\} \\ & < \dots < \exp \{a_{n}^{(q_{λ}^{U})} x_{0} + \sum_{k = 1}^{n} b_{k}^{(p_{λ}^{U}, q_{λ}^{U})}\} . \end{aligned}

(43)

Notice that for the strictness of the above inequalities we have used the fact that

ϕ_{λ} (x) < ϕ_{λ}^{U} (x)

for some (in fact, all but at most two)

x \in N_{0}

(cf. Properties 3(P19) below). Since for some admissible choices of

p_{λ}^{U}, q_{λ}^{U}

and some

n \in N

the last term in (43) can become larger than 1, one needs to take into account the cutoff-point 1 arising from (9). The lower bound

B_{λ, X_{0}, n}^{L}

of part (b), as well as the exact value of part (a) follow from (29) in an analogous manner by employing

p_{λ}^{L}, q_{λ}^{L}

and

p_{λ}^{E}, q_{λ}^{E}

respectively. Furthermore, we use the fact that for

(β_{A}, β_{H}, α_{A}, α_{H}, λ) \in (P_{NI} \cup P_{SP, 1}) \times] 0, 1 [

one gets from (38) the relation

b_{n}^{(p_{λ}^{E}, q_{λ}^{E})} = \frac{α_{A}}{β_{A}} a_{n}^{(q_{λ}^{E})}

. For the sake of brevity, the corresponding straightforward details are omitted here. Although we take the minimum of the upper bound derived in (43) and 1, the inequality

B_{λ, X_{0}, n}^{L} < B_{λ, X_{0}, n}^{U}

is nevertheless valid: the reason is that for constituting a lower bound, the parameters

p_{λ}^{L}, q_{λ}^{L}

must fulfill either the conditions

[p_{λ}^{L} - α_{λ} < 0

and

q_{λ}^{L} - β_{λ} \leq 0]

or

[p_{λ}^{L} - α_{λ} \leq 0

and

q_{λ}^{L} - β_{λ} < 0]

(or both), which guarantees that

B_{λ, X_{0}, n}^{L} < 1

. The proof for all

λ \in R \ [0, 1]

works completely analogously, by taking into account the generally valid lower bound

H_{λ} (P_{A, n} ∥ P_{H, n}) \geq 1

(cf. (11)). □

3.2. Some Useful Facts for Deeper Analyses

Theorem 1(b) and Remark 1(a) indicate the crucial role of the expression

{\tilde{B}}_{λ, X_{0}, n}^{(p, q)}

and that the choice of the quantities

p, q

depends on the underlying (e.g., fixed) offspring-immigration parameter constellation

(β_{A}, β_{H}, α_{A}, α_{H})

as well as on the (e.g., selectable) value of

λ

, i.e.,

p_{λ}^{A} = p^{A} (β_{A}, β_{H}, α_{A}, α_{H}, λ)

and

q_{λ}^{A} = q^{A} (β_{A}, β_{H}, α_{A}, α_{H}, λ)

with

A \in {E, L, U}

. In order to study the desired time-behaviour

n \mapsto {\tilde{B}}_{λ, X_{0}, n}^{(\cdot, \cdot)}

of the Hellinger integral bounds resp. exact values, one therefore faces a six-dimensional (and thus highly non-obvious) detailed analysis, including the search for criteria (in addition to (35)) on good/optimal choices of

p_{λ}^{L}, q_{λ}^{L}, p_{λ}^{U}, q_{λ}^{U}

. Since these criteria will (almost) always imply the nonnegativity of

p_{λ}^{A}, q_{λ}^{A}

(

A \in {L, U}

) and

p_{λ}^{E} \geq 0, q_{λ}^{E} > 0

(cf. Remark 1(a)), let us first present some fundamental properties of the underlying crucial sequences

{(a_{n}^{(q)})}_{n \in N}

and

{(b_{n}^{(p, q)})}_{n \in N}

for general

p \geq 0, q \geq 0

.

Properties 1.

For all

λ \in R

the following holds:

(P1)

If

0 < q < β_{λ}

, then the sequence

{(a_{n}^{(q)})}_{n \in N}

is strictly negative, strictly decreasing and converges to the unique negative solution

x_{0}^{(q)} \in] - β_{λ}, q - β_{λ} [

of the equation

ξ_{λ}^{(q)} (x) = q \cdot e^{x} - β_{λ} = x .

(44)

(P2)

If

0 < q = β_{λ}

, then

a_{n}^{(q)} \equiv 0

.

(P3)

If

q > max {0, β_{λ}}

, then the sequence

{(a_{n}^{(q)})}_{n \in N}

is strictly positive and strictly increasing. Notice that in this setup,

q = 1

implies

min {1, e^{β_{λ} - 1}} = e^{β_{λ} - 1} < q

.

(P3a): If additionally $q \leq min \{1, e^{β_{λ} - 1}\}$ , then the sequence ${(a_{n}^{(q)})}_{n \in N}$ converges to the smallest positive solution $x_{0}^{(q)} \in] 0, - log q]$ of the Equation (44).
(P3b): If additionally $q > min \{1, e^{β_{λ} - 1}\}$ , then the sequence ${(a_{n}^{(q)})}_{n \in N}$ diverges to ∞, faster than exponentially (i.e., there do not exist constants $c_{1}, c_{2} \in R$ such that $a_{n}^{(q)} \leq e^{c_{1} + c_{2} n}$ for all $n \in N$ ).

(P4)

If

q = 0

, then one gets

a_{n}^{(0)} \equiv - β_{λ}

.

Due to the linear interrelation (38), these results directly carry over to the behaviour of the sequence

{(b_{n}^{(p, q)})}_{n \in N}

:

(P5)

If

p > 0

and

0 < q < β_{λ}

, then the sequence

{(b_{n}^{(p, q)})}_{n \in N}

is strictly decreasing and converges to

p \cdot e^{x_{0}^{(q)}} - α_{λ}

. Trivially,

b_{1}^{(p, q)} = p - α_{λ}

.

(P5a): If additionally $p < α_{λ}$ , then ${(b_{n}^{(p, q)})}_{n \in N}$ is strictly negative for all $n \in N$ .
(P5b): If additionally $p = α_{λ}$ , then ${(b_{n}^{(p, q)})}_{n \in N}$ is strictly negative for all $n \in N \ {1}$ .
(P5c): If additionally $p > α_{λ}$ , then ${(b_{n}^{(p, q)})}_{n \in N}$ is strictly positive for some (and possibly for all) $n \in N$ .

(P6)

If

0 < q = β_{λ}

, then

b_{n}^{(p, q)} \equiv p - α_{λ}

.

(P7)

If

p > 0

and

q > max {0, β_{λ}}

, then the sequence

{(b_{n}^{(p, q)})}_{n \in N}

is strictly increasing.

(P7a): If additionally $q \leq min \{1, e^{β_{λ} - 1}\}$ , then the sequence ${(b_{n}^{(p, q)})}_{n \in N}$ converges to $p \cdot e^{x_{0}^{(q)}} - α_{λ} \in] p - α_{λ}, p / q - α_{λ}]$ ; this limit can take any sign, depending on the parameter constellation.
(P7b): If additionally $q > min \{1, e^{β_{λ} - 1}\}$ , then the sequence ${(b_{n}^{(p, q)})}_{n \in N}$ diverges to ∞, faster than exponentially.

(P8)

For the remaining cases we get:

b_{n}^{(0, q)} \equiv - α_{λ}

and

b_{n}^{(p, 0)} \equiv p \cdot e^{- β_{λ}} - α_{λ}

(

p \in R, q \in R

).

Moreover, in our investigations we will repeatedly make use of the function

ξ_{λ}^{(q)} (\cdot)

from the definition (36) of

a_{n}^{(q)}

(see also (44)), which has the following properties:

(P9)

For

q \in] 0, \infty [

and all

λ \in R \ {0, 1}

the function

ξ_{λ}^{(q)} (\cdot)

is strictly increasing, strictly convex and smooth, and there holds

(P9a): $ξ_{λ}^{(q)} (0) \{\begin{matrix} < 0, & if q < β_{λ}, \\ = 0, & if q = β_{λ}, \\ > 0, & if q > β_{λ} . \end{matrix}$
(P9b): $lim_{x \to - \infty} ξ_{λ}^{(q)} (x) = - β_{λ}, and lim_{x \to \infty} ξ_{λ}^{(q)} (x) = \infty .$

The proof of these properties is provided in Appendix A.1. From Properties 1 (P1) to (P4) we can see that the behaviour of the sequence

{(a_{n}^{(q)})}_{n \in N}

can basically be classified into four different types: besides the case (P2), where

a_{n}^{(q)}

is constant, the sequence can be either (i) strictly decreasing and convergent (e.g., for the NI case

(β_{A}, β_{H}, α_{A}, α_{H}, λ) = (0.5, 2, 0, 0, 0.5)

leading to

β_{λ} = λ β_{A} + (1 - λ) β_{H} = 1.25

and to

q : = q_{λ}^{E} = β_{A}^{λ} β_{H}^{1 - λ} = 1

, cf. (33) resp. Theorem 1(a)), or (ii) strictly increasing and convergent (e.g., for

(β_{A}, β_{H}, α_{A}, α_{H}, λ) = (0.5, 2, 0, 0, 1.5)

leading to

β_{λ} = - 0.25

,

q : = q_{λ}^{E} = 0.25

), or (iii) strictly increasing and divergent (e.g., for

(β_{A}, β_{H}, α_{A}, α_{H}, λ) = (0.5, 2, 0, 0, 2.7)

leading to

β_{λ} = - 2.05

,

q : = q_{λ}^{E} \approx 0.047366

). Within our running-example epidemiological context of Section 2.3, this corresponds to a “potentially dangerous” infectious-disease-transmission situation

(H)

(with supercritical reproduction number

β_{H} = 2

), whereas

(A)

describes a “mild” situation (with “low” subcritical

β_{A} = 0.5

).
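
The three example constellations above can be checked numerically. The following Python sketch is not taken from the article: it assumes that the recursion (36) reads a_n = ξ_λ^{(q)}(a_{n-1}) with start value a_0 = 0 (so that a_1 = q - β_λ, in line with (43)), and the helper names `a_sequence`, `classify` and `no_immigration_setup` are hypothetical.

```python
import math

def a_sequence(q, beta_lam, n_max):
    """Return [a_1, ..., a_{n_max}] for a_n = q*exp(a_{n-1}) - beta_lam.

    The start value a_0 = 0 is an assumption about definition (36);
    it reproduces a_1 = q - beta_lam as used in (43).
    """
    seq, a = [], 0.0
    for _ in range(n_max):
        try:
            a = q * math.exp(a) - beta_lam
        except OverflowError:          # exp() overflowed: treat as divergence
            a = math.inf
        seq.append(a)
    return seq

def classify(q, beta_lam):
    """Type of behaviour according to Properties 1 (P1) to (P4)."""
    if q == 0:
        return "constant (P4)"
    if q < beta_lam:
        return "decreasing, convergent (P1)"
    if q == beta_lam:
        return "constant (P2)"
    if q <= min(1.0, math.exp(beta_lam - 1.0)):
        return "increasing, convergent (P3a)"
    return "increasing, divergent (P3b)"

def no_immigration_setup(beta_A, beta_H, lam):
    """Return (q_lambda^E, beta_lambda) for the NI case."""
    q = beta_A**lam * beta_H**(1.0 - lam)
    beta_lam = lam * beta_A + (1.0 - lam) * beta_H
    return q, beta_lam

# The three constellations (beta_A, beta_H) = (0.5, 2) discussed in the text.
for lam in (0.5, 1.5, 2.7):
    q, beta_lam = no_immigration_setup(0.5, 2.0, lam)
    print(lam, round(q, 6), round(beta_lam, 6), classify(q, beta_lam))
```

For λ = 2.7 the value of q exceeds the threshold e^{β_λ - 1} only marginally, so the iterates increase very slowly for many steps before they blow up; a short run can therefore look convergent, and the criterion in (P3a)/(P3b) is the more reliable check.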

As already mentioned before, the sequences

{(a_{n}^{(q)})}_{n \in N}

and

{(b_{n}^{(p, q)})}_{n \in N}

–whose behaviours for general

p \geq 0

and

q \geq 0

were described by the Properties 1–have to be evaluated at setup-dependent choices

p = p_{λ} = p (β_{A}, β_{H}, α_{A}, α_{H}, λ)

and

q = q_{λ} = q (β_{A}, β_{H}, α_{A}, α_{H}, λ)

. Hence, for fixed

(β_{A}, β_{H}, α_{A}, α_{H})

, one of the questions–which arises in the course of the desired investigations of the time-behaviour of the Hellinger integral bounds (resp. exact values)–is for which

λ \in R

the sequence

{(a_{n}^{(q_{λ})})}_{n \in N}

converges. In the following, we illuminate this for the important special case

q_{λ} = β_{A}^{λ} β_{H}^{1 - λ}

. Suppose at first that

β_{A} \neq β_{H}

. Properties 1 (P1) implies that for

λ \in] 0, 1 [

one has

{lim}_{n \to \infty} a_{n}^{(q_{λ})} = x_{0}^{(q_{λ})} \in] - β_{λ}, q_{λ} - β_{λ} [

, and Lemma A1 states that

q_{λ} - β_{λ} < 0

. For

λ \in R \ [0, 1]

, there holds

q_{λ} > max {0, β_{λ}}

, and from (P3) one can see that

{(a_{n}^{(q_{λ})})}_{n \in N}

does not converge to

x_{0}^{(q_{λ})}

in general, but only if

q_{λ} \leq min {1, e^{β_{λ} - 1}}

, which constitutes an implicit condition on

λ

. This can be made explicit, with the help of the auxiliary variables

\begin{matrix} λ_{-} : = λ_{-} (β_{A}, β_{H}) : = \{\begin{matrix} inf \{λ \leq 0 : β_{A}^{λ} β_{H}^{1 - λ} \leq min \{1, exp {λ β_{A} + (1 - λ) β_{H} - 1}\}\}, \\ in case that the set is nonempty, \\ 0, else, \end{matrix} \\ λ_{+} : = λ_{+} (β_{A}, β_{H}) : = \{\begin{matrix} sup \{λ \geq 1 : β_{A}^{λ} β_{H}^{1 - λ} \leq min \{1, exp {λ β_{A} + (1 - λ) β_{H} - 1}\}\}, \\ in case that the set is nonempty, \\ 1, else . \end{matrix} \end{matrix}

For the constellation

β_{A} = β_{H} > 0

we clearly obtain

q_{λ} = β_{A}^{λ} β_{H}^{1 - λ} = β_{A} = β_{H} = β_{λ}

. Hence, (P2) implies that the sequence

{(a_{n}^{(q_{λ})})}_{n \in N}

converges for all

λ \in R \ {0, 1}

and we can set

λ_{-} : = - \infty

as well as

λ_{+} : = \infty

. Incorporating this and by adapting a result of Linkov & Lunyova [53] on

λ_{-} (v_{1}, v_{2}), λ_{+} (v_{1}, v_{2})

for

β_{A} \neq β_{H}

, we end up with

Lemma 1.

(a) For all

β_{A} > 0, β_{H} > 0

with

β_{A} \neq β_{H}

there holds

λ_{-} = λ_{-} (β_{A}, β_{H}) = \{\begin{matrix} 0, & if β_{H} \geq 1, \\ \overset{˘}{λ}, & if β_{H} < 1 and β_{A} \notin [β_{H}, β_{H} z (β_{H})], \\ - \infty, & if β_{H} < 1 and β_{A} \in] β_{H}, β_{H} z (β_{H})], \end{matrix}

λ_{+} = λ_{+} (β_{A}, β_{H}) = \{\begin{matrix} 1, & if β_{A} \geq 1, \\ \overset{˘}{λ}, & if β_{A} < 1 and β_{H} \notin [β_{A}, β_{A} z (β_{A})], \\ \infty, & if β_{A} < 1 and β_{H} \in] β_{A}, β_{A} z (β_{A})], \end{matrix}

where

\overset{˘}{λ} : = \overset{˘}{λ} (β_{A}, β_{H}) : = \frac{β_{H} - 1 - log (β_{H})}{β_{H} - β_{A} + log (\frac{β_{A}}{β_{H}})} \{\begin{matrix} < 0, & if β_{H} < 1 and β_{A} \notin [β_{H}, β_{H} z (β_{H})], \\ > 1, & if β_{A} < 1 and β_{H} \notin [β_{A}, β_{A} z (β_{A})] . \end{matrix}

Here, for fixed

β \in] 0, \infty [\ {1}

we denote by

z (β)

the unique solution of the equation

log (x) - β (x - 1) = 0

,

x \in] 0, \infty [\ {1}

. For

β = 1

,

z (β) = 1

denotes the unique solution of

log (x) - (x - 1) = 0, x \in] 0, \infty [

.

(b) For all

β_{A} = β_{H} > 0

one gets

λ_{-} = λ_{-} (β_{A}, β_{H}) = - \infty

as well as

λ_{+} = λ_{+} (β_{A}, β_{H}) = \infty

.

Notice that the relationship

\overset{˘}{λ} (β_{A}, β_{H}) = 1 - \overset{˘}{λ} (β_{H}, β_{A})

is consistent with the skew symmetry (8).

A corresponding proof is given in Appendix A.1.
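
The quantities of Lemma 1(a) are straightforward to evaluate. The sketch below is illustrative only: the function names are hypothetical, the root z(β) is found by plain bisection (a choice made here, not a method prescribed in the article), and z is implemented only for 0 < β < 1, which is the only range in which Lemma 1(a) needs it.

```python
import math

def breve_lambda(beta_A, beta_H):
    """Solution l of beta_A^l * beta_H^(1-l) = exp(l*beta_A + (1-l)*beta_H - 1)."""
    num = beta_H - 1.0 - math.log(beta_H)
    den = beta_H - beta_A + math.log(beta_A / beta_H)
    return num / den

def z(beta, tol=1e-12):
    """Root x != 1 of log(x) - beta*(x - 1) = 0, for 0 < beta < 1 only.

    The function is concave with maximum at 1/beta > 1, so the nontrivial
    root lies to the right of 1/beta and can be bracketed and bisected.
    """
    g = lambda x: math.log(x) - beta * (x - 1.0)
    lo = 1.0 / beta
    hi = 2.0 * lo
    while g(hi) > 0.0:                 # expand until the sign changes
        hi *= 2.0
    while hi - lo > tol * hi:
        mid = 0.5 * (lo + hi)
        if g(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def lambda_plus(beta_A, beta_H):
    """lambda_+ as in Lemma 1(a); beta_A != beta_H is assumed."""
    if beta_A >= 1.0:
        return 1.0
    if beta_A < beta_H <= beta_A * z(beta_A):
        return math.inf
    return breve_lambda(beta_A, beta_H)

def lambda_minus(beta_A, beta_H):
    """lambda_- as in Lemma 1(a); beta_A != beta_H is assumed."""
    if beta_H >= 1.0:
        return 0.0
    if beta_H < beta_A <= beta_H * z(beta_H):
        return -math.inf
    return breve_lambda(beta_A, beta_H)

# Running example (beta_A, beta_H) = (0.5, 2) from Section 3.2.
print(lambda_minus(0.5, 2.0), lambda_plus(0.5, 2.0))
```

For (β_A, β_H) = (0.5, 2) this yields a value of λ_+ between 1.5 and 2.7, which matches the earlier examples where λ = 1.5 leads to convergence and λ = 2.7 to divergence.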

With these auxiliary basic facts in hand, let us now work out our detailed investigations of the time-behaviour

n \mapsto H_{λ} (P_{A, n} ∥ P_{H, n})

, where we start with the exactly treatable case (a) in Theorem 1.

3.3. Detailed Analyses of the Exact Recursive Values, i.e., for the Cases $(β_{A}, β_{H}, α_{A}, α_{H}) \in P_{NI} \cup P_{SP, 1}$

In the no-immigration-case

(β_{A}, β_{H}, α_{A}, α_{H}) \in P_{NI}

and in the equal-fraction-case

(β_{A}, β_{H}, α_{A}, α_{H}) \in P_{SP, 1}

, the Hellinger integral can be calculated exactly in terms of

H_{λ} (P_{A, n} ∥ P_{H, n}) = V_{λ, X_{0}, n}

(cf. (39)), as proposed in part (a) of Theorem 1. This quantity depends on the behaviour of the sequence

{(a_{n}^{(q_{λ}^{E})})}_{n \in N}

, with

q_{λ}^{E} : = β_{A}^{λ} β_{H}^{1 - λ} > 0

, and of the sum

{(\frac{α_{A}}{β_{A}} \sum_{k = 1}^{n} a_{k}^{(q_{λ}^{E})})}_{n \in N}

. The last expression is equal to zero on

P_{NI}

. On

P_{SP, 1}

, this sum is unequal to zero. Using Lemma A1 we conclude that

q_{λ}^{E} < β_{λ}

(resp.

q_{λ}^{E} > β_{λ}

) iff

λ \in] 0, 1 [

(resp.

λ \in R \ [0, 1]

), since on

P_{NI} \cup P_{SP, 1}

there holds

β_{A} \neq β_{H}

. Thus, from Properties 1 (P1) we can see that the sequence

{(a_{n}^{(q_{λ}^{E})})}_{n \in N}

is strictly negative, strictly decreasing and it converges to the unique solution

x_{0}^{(q_{λ}^{E})} \in] - β_{λ}, q_{λ}^{E} - β_{λ} [

of the Equation (44) if

λ \in] 0, 1 [

. For

λ \in R \ [0, 1]

, (P3) implies that the sequence

{(a_{n}^{(q_{λ}^{E})})}_{n \in N}

is strictly positive, strictly increasing and converges to the smallest positive solution

x_{0}^{(q_{λ}^{E})} \in] 0, - log (q_{λ}^{E})]

of the Equation (44) in case that (P3a) is satisfied, otherwise it diverges to ∞. Thus, we have shown the following detailed behaviour of Hellinger integrals:

Proposition 2.

For all

(β_{A}, β_{H}, α_{A}, α_{H}, λ) \in P_{NI} \times] 0, 1 [

and all initial population sizes

X_{0} \in N

there holds

\begin{matrix} (a) & H_{λ} (P_{A, 1} ∥ P_{H, 1}) = exp \{(β_{A}^{λ} β_{H}^{1 - λ} - λ β_{A} - (1 - λ) β_{H}) X_{0}\} < 1, \\ (b) & the sequence {(H_{λ} (P_{A, n} ∥ P_{H, n}))}_{n \in N} given by \\ H_{λ} (P_{A, n} ∥ P_{H, n}) = exp \{a_{n}^{(q_{λ}^{E})} X_{0}\} = : V_{λ, X_{0}, n} \\ is strictly decreasing, \\ (c) & lim_{n \to \infty} H_{λ} (P_{A, n} ∥ P_{H, n}) = exp \{x_{0}^{(q_{λ}^{E})} X_{0}\} \in] 0, 1 [, \\ (d) & lim_{n \to \infty} \frac{1}{n} log H_{λ} (P_{A, n} ∥ P_{H, n}) = 0, \\ (e) & the map X_{0} \mapsto V_{λ, X_{0}, n} is strictly decreasing . \end{matrix}

Proposition 3.

For all

(β_{A}, β_{H}, α_{A}, α_{H}, λ) \in P_{NI} \times (R \ [0, 1])

and all initial population sizes

X_{0} \in N

there holds with

q_{λ}^{E} : = β_{A}^{λ} β_{H}^{1 - λ}

\begin{matrix} (a) & H_{λ} (P_{A, 1} ∥ P_{H, 1}) = exp \{(β_{A}^{λ} β_{H}^{1 - λ} - β_{λ}) \cdot X_{0}\} > 1, \\ (b) & the sequence {(H_{λ} (P_{A, n} ∥ P_{H, n}))}_{n \in N} given by \\ H_{λ} (P_{A, n} ∥ P_{H, n}) = exp \{a_{n}^{(q_{λ}^{E})} \cdot X_{0}\} = : V_{λ, X_{0}, n} \\ is strictly increasing, \\ (c) & lim_{n \to \infty} H_{λ} (P_{A, n} ∥ P_{H, n}) = \{\begin{matrix} exp \{x_{0}^{(q_{λ}^{E})} \cdot X_{0}\} > 1, & if λ \in [λ_{-}, λ_{+}] \ [0, 1], \\ \infty, & if λ \in] - \infty, λ_{-} [\cup] λ_{+}, \infty [, \end{matrix} \\ (d) & lim_{n \to \infty} \frac{1}{n} log H_{λ} (P_{A, n} ∥ P_{H, n}) = \{\begin{matrix} 0, & if λ \in [λ_{-}, λ_{+}] \ [0, 1], \\ \infty, & if λ \in] - \infty, λ_{-} [\cup] λ_{+}, \infty [, \end{matrix} \\ (e) & the map X_{0} \mapsto V_{λ, X_{0}, n} is strictly increasing . \end{matrix}

In the case

(β_{A}, β_{H}, α_{A}, α_{H}) \in P_{SP, 1}

, the sequence

{(a_{n}^{(q_{λ}^{E})})}_{n \in N}

under consideration is formally the same, with the parameter

q_{λ}^{E} : = β_{A}^{λ} β_{H}^{1 - λ} > 0

. However, in contrast to the case

P_{NI}

, on

P_{SP, 1}

both the sequence

{(a_{n}^{(q_{λ}^{E})})}_{n \in N}

and the sum

{(\frac{α_{A}}{β_{A}} \sum_{k = 1}^{n} a_{k}^{(q_{λ}^{E})})}_{n \in N}

are strictly decreasing in case that

λ \in] 0, 1 [

, and strictly increasing in case that

λ \in R \ [0, 1]

. The respective convergence behaviours are given in Properties 1 (P1) and (P3). We thus obtain

Proposition 4.

For all

(β_{A}, β_{H}, α_{A}, α_{H}, λ) \in P_{SP, 1} \times] 0, 1 [

and all initial population sizes

X_{0} \in N

there holds with

q_{λ}^{E} : = β_{A}^{λ} β_{H}^{1 - λ}

\begin{matrix} (a) & H_{λ} (P_{A, 1} ∥ P_{H, 1}) = exp \{(β_{A}^{λ} β_{H}^{1 - λ} - β_{λ}) \cdot (X_{0} + \frac{α_{A}}{β_{A}})\} < 1, \\ (b) & the sequence {(H_{λ} (P_{A, n} ∥ P_{H, n}))}_{n \in N} given by \\ H_{λ} (P_{A, n} ∥ P_{H, n}) = exp \{a_{n}^{(q_{λ}^{E})} \cdot X_{0} + \frac{α_{A}}{β_{A}} \sum_{k = 1}^{n} a_{k}^{(q_{λ}^{E})}\} = : V_{λ, X_{0}, n} \\ is strictly decreasing, \\ (c) & lim_{n \to \infty} H_{λ} (P_{A, n} ∥ P_{H, n}) = 0, \\ (d) & lim_{n \to \infty} \frac{1}{n} log H_{λ} (P_{A, n} ∥ P_{H, n}) = \frac{α_{A}}{β_{A}} \cdot x_{0}^{(q_{λ}^{E})} < 0, \\ (e) & the map X_{0} \mapsto V_{λ, X_{0}, n} is strictly decreasing . \end{matrix}

Proposition 5.

For all

(β_{A}, β_{H}, α_{A}, α_{H}, λ) \in P_{SP, 1} \times (R \ [0, 1])

and all initial population sizes

X_{0} \in N

there holds with

q_{λ}^{E} : = β_{A}^{λ} β_{H}^{1 - λ}

\begin{matrix} (a) & H_{λ} (P_{A, 1} ∥ P_{H, 1}) = exp \{(β_{A}^{λ} β_{H}^{1 - λ} - β_{λ}) \cdot (X_{0} + \frac{α_{A}}{β_{A}})\} > 1, \\ (b) & the sequence {(H_{λ} (P_{A, n} ∥ P_{H, n}))}_{n \in N} given by \\ H_{λ} (P_{A, n} ∥ P_{H, n}) = exp \{a_{n}^{(q_{λ}^{E})} \cdot X_{0} + \frac{α_{A}}{β_{A}} \sum_{k = 1}^{n} a_{k}^{(q_{λ}^{E})}\} = : V_{λ, X_{0}, n} \\ is strictly increasing, \\ (c) & lim_{n \to \infty} H_{λ} (P_{A, n} ∥ P_{H, n}) = \infty, \\ (d) & lim_{n \to \infty} \frac{1}{n} log H_{λ} (P_{A, n} ∥ P_{H, n}) = \{\begin{matrix} \frac{α_{A}}{β_{A}} \cdot x_{0}^{(q_{λ}^{E})} > 0, & if λ \in [λ_{-}, λ_{+}] \ [0, 1], \\ \infty, & if λ \in] - \infty, λ_{-} [\cup] λ_{+}, \infty [, \end{matrix} \\ (e) & the map X_{0} \mapsto V_{λ, X_{0}, n} is strictly increasing . \end{matrix}

Due to the nature of the equal-fraction-case

P_{SP, 1}

, in the assertions (a), (b), (d) of the Propositions 4 and 5, the fraction

α_{A} / β_{A}

can be equivalently replaced by

α_{H} / β_{H}

.
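
Propositions 2 to 5 can be illustrated with a few lines of Python. This is a minimal sketch under stated assumptions: the recursion (36) is taken with start value a_0 = 0, the function names are hypothetical, and the immigration constellation (α_A, α_H) = (1, 4) used together with (β_A, β_H) = (0.5, 2) is an invented equal-fraction example, not one from the article. The computation stays on the log scale because V_{λ, X_0, n} itself underflows quickly on P_{SP,1}.

```python
import math

def a_sequence(q, beta_lam, n):
    """[a_1, ..., a_n] for a_k = q*exp(a_{k-1}) - beta_lam, a_0 = 0 (assumed)."""
    seq, a = [], 0.0
    for _ in range(n):
        try:
            a = q * math.exp(a) - beta_lam
        except OverflowError:          # divergent regime, cf. (P3b)
            a = math.inf
        seq.append(a)
    return seq

def log_exact_hellinger(beta_A, beta_H, alpha_A, lam, X0, n):
    """log V_{lam,X0,n} = a_n*X0 + (alpha_A/beta_A) * sum_{k=1..n} a_k.

    Meaningful only on P_NI (alpha_A = 0) or on P_SP,1
    (alpha_A/beta_A = alpha_H/beta_H), where the Hellinger integral is exact.
    """
    q = beta_A**lam * beta_H**(1.0 - lam)
    beta_lam = lam * beta_A + (1.0 - lam) * beta_H
    a = a_sequence(q, beta_lam, n)
    return a[-1] * X0 + (alpha_A / beta_A) * sum(a)

# alpha_A = 0: no-immigration case; alpha_A = 1 (with alpha_H = 4): hypothetical
# equal-fraction case, since 1/0.5 = 4/2.
for alpha_A in (0.0, 1.0):
    for n in (1, 5, 25):
        print(alpha_A, n, log_exact_hellinger(0.5, 2.0, alpha_A, 0.5, 5, n))
```

In the no-immigration run the logarithm levels off at x_0^{(q_λ^E)} · X_0, whereas in the equal-fraction run it keeps decreasing roughly linearly in n, which is the difference between parts (c) and (d) of Propositions 2 and 4.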

Remark 2.

For the (to our context) incompatible setup of GWI with Poisson offspring but nonstochastic immigration of constant value 1, an “analogue” of part (d) of the Propositions 4 resp. 5 was established in Linkov & Lunyova [53].

3.4. Some Preparatory Basic Facts for the Remaining Cases $(β_{A}, β_{H}, α_{A}, α_{H}) \in P_{SP} \ P_{SP, 1}$

The bounds

B_{λ, X_{0}, n}^{L}, B_{λ, X_{0}, n}^{U}

for the Hellinger integral introduced in formula (40) in Theorem 1 can be chosen arbitrarily from a

(p_{λ}^{L}, q_{λ}^{L}, p_{λ}^{U}, q_{λ}^{U})

-indexed set of context-specific parameters satisfying (34), or equivalently (35).

In order to derive bounds which are optimal, with respect to goals that will be discussed later, the following monotonicity properties of the sequences

{(a_{n}^{(q)})}_{n \in N}

and

{(b_{n}^{(p, q)})}_{n \in N}

(cf. (36), (37)) for general, context-independent parameters q and p, will turn out to be very useful:

Properties 2.

(P10)

For

0 \leq q_{1} < q_{2} < \infty

there holds

a_{n}^{(q_{1})} < a_{n}^{(q_{2})}

for all

n \in N

.

(P11)

For each fixed

q \geq 0

and

0 \leq p_{1} < p_{2} < \infty

there holds

b_{n}^{(p_{1}, q)} < b_{n}^{(p_{2}, q)}

, for all

n \in N

.

(P12)

For fixed

p > 0

and

0 \leq q_{1} < q_{2}

it follows

b_{n}^{(p, q_{1})} < b_{n}^{(p, q_{2})}

for all

n \in N

.

(P13)

Suppose that

0 \leq p_{1} < p_{2}

and

0 \leq q_{2} < q_{1}

. For fixed

n \in N

, no dominance assertion can be conjectured for

b_{n}^{(p_{1}, q_{1})}, b_{n}^{(p_{2}, q_{2})}

. As an example, consider the setup

(β_{A}, β_{H}, α_{A}, α_{H}, λ) = (0.4, 0.8, 5, 3, 0.5)

; within our running-example epidemiological context of Section 2.3, this corresponds to a “nearly dangerous” infectious-disease-transmission situation

(H)

(with nearly critical reproduction number

β_{H} = 0.8

and importation mean of

α_{H} = 3

), whereas

(A)

describes a “mild” situation (with “low” subcritical

β_{A} = 0.4

and

α_{A} = 5

). On the nonnegative real line, the function

ϕ_{λ} (x)

can be bounded from above by the linear functions

ϕ_{λ}^{U, 1} (x) : = p_{1} + q_{1} x : = 4.040 + 0.593 \cdot x

as well as by

ϕ_{λ}^{U, 2} (x) : = p_{2} + q_{2} x : = 4.110 + 0.584 \cdot x

. Clearly,

p_{1} < p_{2}

and

q_{1} > q_{2}

. Let us show the first eight elements and the respective limits of the corresponding sequences

b_{n}^{(p_{1}, q_{1})}, b_{n}^{(p_{2}, q_{2})}

:

n	1	2	3	4	5	6	7	8	⋯	∞
$b_{n}^{(p_{1}, q_{1})}$	0.040	0.011	−0.005	−0.015	−0.021	−0.024	−0.026	−0.028	⋯	−0.029
$b_{n}^{(p_{2}, q_{2})}$	0.110	0.045	0.007	−0.014	−0.026	−0.033	−0.036	−0.039	⋯	−0.041

(P14)

For arbitrary

0 < p_{1}, p_{2}

and

0 \leq q_{1}, q_{2} \leq min {1, e^{β_{λ} - 1}}

suppose that

log (p_{1}) + x_{0}^{(q_{1})} < log (p_{2}) + x_{0}^{(q_{2})}

. Then there holds

p_{1} \cdot e^{x_{0}^{(q_{1})}} - α_{λ} = lim_{n \to \infty} \frac{1}{n} \sum_{k = 1}^{n} b_{k}^{(p_{1}, q_{1})} < lim_{n \to \infty} \frac{1}{n} \sum_{k = 1}^{n} b_{k}^{(p_{2}, q_{2})} = p_{2} \cdot e^{x_{0}^{(q_{2})}} - α_{λ} .
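
The crossing behaviour tabulated in (P13) can be reproduced approximately. The sketch below assumes that (37) reads b_n^{(p, q)} = p · exp(a_{n-1}^{(q)}) - α_λ with a_0 = 0, which is consistent with b_1 = p - α_λ and with the limit stated in (P5); the function name `b_sequence` is hypothetical. Since only the rounded values of p_1, q_1, p_2, q_2 are available, the output is close to, but not digit-for-digit identical with, the table.

```python
import math

def b_sequence(p, q, alpha_lam, beta_lam, n_max):
    """[b_1, ..., b_n] with b_n = p*exp(a_{n-1}) - alpha_lam and a_0 = 0 (assumed)."""
    seq, a = [], 0.0
    for _ in range(n_max):
        seq.append(p * math.exp(a) - alpha_lam)   # uses the previous a
        a = q * math.exp(a) - beta_lam            # advance a to the next index
    return seq

lam = 0.5
beta_A, beta_H, alpha_A, alpha_H = 0.4, 0.8, 5.0, 3.0
alpha_lam = lam * alpha_A + (1.0 - lam) * alpha_H   # weighted immigration mean
beta_lam = lam * beta_A + (1.0 - lam) * beta_H      # weighted offspring mean

b1 = b_sequence(4.040, 0.593, alpha_lam, beta_lam, 8)   # (p_1, q_1) from (P13)
b2 = b_sequence(4.110, 0.584, alpha_lam, beta_lam, 8)   # (p_2, q_2) from (P13)
for n, (u, v) in enumerate(zip(b1, b2), start=1):
    print(n, round(u, 3), round(v, 3))
```

The first sequence starts below the second one and ends above it, so neither parameter pair dominates the other for all n.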

From (P10) to (P12) one deduces that both sequences

{(a_{n}^{(q)})}_{n \in N}

and

{(b_{n}^{(p, q)})}_{n \in N}

are monotone in the general parameters

p, q \geq 0

. Thus, for the upper bound of the Hellinger integral

B_{λ, X_{0}, n}^{U}

we should use nonnegative context-specific parameters

p_{λ}^{U} = p^{U} (β_{A}, β_{H}, α_{A}, α_{H}, λ)

and

q_{λ}^{U} = q^{U} (β_{A}, β_{H}, α_{A}, α_{H}, λ)

which are as small as possible, and for the lower bound

B_{λ, X_{0}, n}^{L}

we should use nonnegative context-specific parameters

p_{λ}^{L} = p^{L} (β_{A}, β_{H}, α_{A}, α_{H}, λ)

and

q_{λ}^{L} = q^{L} (β_{A}, β_{H}, α_{A}, α_{H}, λ)

which are as large as possible, of course, subject to the (equivalent) restrictions (34) and (35).

To find “optimal” parameter pairs, we have to study the following properties of the function

ϕ_{λ} (\cdot) = ϕ (\cdot, β_{A}, β_{H}, α_{A}, α_{H}, λ)

defined on

[0, \infty [

in (30) (which are also valid for the previous parameter context

(β_{A}, β_{H}, α_{A}, α_{H}) \in (P_{NI} \cup P_{SP, 1})

):

Properties 3.

(P15): One has

$ϕ_{λ} (x) = {(α_{A} + β_{A} x)}^{λ} {(α_{H} + β_{H} x)}^{1 - λ} - λ (α_{A} + β_{A} x) - (1 - λ) (α_{H} + β_{H} x) \{\begin{matrix} \leq 0, & if λ \in] 0, 1 [, \\ \geq 0, & if λ \in R \ [0, 1], \end{matrix}$

where equality holds iff $f_{A} (x) = f_{H} (x)$ for some $x \in [0, \infty [$ iff $x = x^{*} : = \frac{α_{A} - α_{H}}{β_{H} - β_{A}} \in [0, \infty [$ .
(P16): There holds

$ϕ_{λ} (0) = α_{A}^{λ} α_{H}^{1 - λ} - α_{λ} \{\begin{matrix} \leq 0, & if λ \in] 0, 1 [, \\ \geq 0, & if λ \in R \ [0, 1], \end{matrix}$

with equality iff $α_{A} = α_{H}$ together with $β_{A} \neq β_{H}$ (cf. Lemma A1).
(P17): For all $λ \in R \ {0, 1}$ one gets

$ϕ_{λ}^{'} (x) = λ β_{A} {(f_{A} (x))}^{λ - 1} {(f_{H} (x))}^{1 - λ} + (1 - λ) β_{H} {(f_{A} (x))}^{λ} {(f_{H} (x))}^{- λ} - β_{λ} .$
(P18): There holds

$lim_{x \to \infty} ϕ_{λ}^{'} (x) = β_{A}^{λ} β_{H}^{1 - λ} - β_{λ} \{\begin{matrix} \leq 0, & if λ \in] 0, 1 [, \\ \geq 0, & if λ \in R \ [0, 1], \end{matrix}$

with equality iff $β_{A} = β_{H}$ together with $α_{A} \neq α_{H}$ (cf. Lemma A1).
(P19): There holds

$ϕ_{λ}^{''} (x) = - λ (1 - λ) {(f_{A} (x))}^{λ - 2} {(f_{H} (x))}^{- λ - 1} {(α_{A} β_{H} - α_{H} β_{A})}^{2} \{\begin{matrix} \leq 0, & if λ \in] 0, 1 [, \\ \geq 0, & if λ \in R \ [0, 1], \end{matrix}$

with equality iff $(β_{A}, β_{H}, α_{A}, α_{H}) \in (P_{NI} \cup P_{SP, 1})$. Hence, for $(β_{A}, β_{H}, α_{A}, α_{H}) \in P_{SP} \ P_{SP, 1}$, the function $ϕ_{λ}$ is strictly concave (convex) for $λ \in] 0, 1 [$ ($λ \in R \ [0, 1]$). Notice that $ϕ_{λ}^{'} (0) = λ β_{A} {(\frac{α_{A}}{α_{H}})}^{λ - 1} + (1 - λ) β_{H} {(\frac{α_{A}}{α_{H}})}^{λ} - β_{λ}$ can be either negative (e.g., for the setup $(β_{A}, β_{H}, α_{A}, α_{H}, λ) \in \{(4, 2, 3, 1, 0.5), (4, 2, 5, 1, 2)\}$), or zero (e.g., for $(β_{A}, β_{H}, α_{A}, α_{H}, λ) \in \{(4, 2, 4, 1, 0.5), (4, 2, 3, 1, 2)\}$), or positive (e.g., for $(β_{A}, β_{H}, α_{A}, α_{H}, λ) \in \{(4, 2, 5, 1, 0.5), (4, 2, 2, 1, 2)\}$), where the exemplary parameter constellations have concrete interpretations in our running-example epidemiological context of Section 2.3. Accordingly, for $λ \in] 0, 1 [$, due to concavity and (P17), the function $ϕ_{λ} (\cdot)$ can be either strictly decreasing, or can attain its global maximum in $] 0, \infty [$, or–only in the case $β_{A} = β_{H}$–can be strictly increasing. Analogously, for $λ \in R \ [0, 1]$, the function $ϕ_{λ} (\cdot)$ can be either strictly increasing, or can attain its global minimum in $] 0, \infty [$, or–only in the case $β_{A} = β_{H}$–can be strictly decreasing.
(P20): For all $λ \in R \ {0, 1}$ one has

$\begin{matrix} lim_{x \to \infty} (ϕ_{λ} (x) - (\tilde{r_{λ}} + \tilde{s_{λ}} x)) = 0, \\ for \tilde{r_{λ}} : = \tilde{p_{λ}} - α_{λ} : = λ α_{A} {(\frac{β_{A}}{β_{H}})}^{λ - 1} + (1 - λ) α_{H} {(\frac{β_{A}}{β_{H}})}^{λ} - α_{λ} \\ and \tilde{s_{λ}} : = \tilde{q_{λ}} - β_{λ} : = β_{A}^{λ} β_{H}^{1 - λ} - β_{λ} . \end{matrix}$

The linear function $\tilde{ϕ_{λ}} (x) : = \tilde{r_{λ}} + \tilde{s_{λ}} \cdot x$ constitutes the asymptote of $ϕ_{λ} (\cdot)$ . Notice that if $β_{A} = β_{H}$ one has ${\tilde{s}}_{λ} = 0 = {\tilde{r}}_{λ}$ ; if $β_{A} \neq β_{H}$ we have ${\tilde{s}}_{λ} < 0$ in the case $λ \in] 0, 1 [$ and ${\tilde{s}}_{λ} > 0$ if $λ \in R \ [0, 1]$ . Furthermore, $ϕ_{λ} (0) < \tilde{r_{λ}}$ if $λ \in] 0, 1 [$ and $ϕ_{λ} (0) > \tilde{r_{λ}}$ if $λ \in R \ [0, 1]$ , (cf. Lemma A1(c1) and (c2)). If $α_{A} = α_{H}$ (and thus $β_{A} \neq β_{H}$ ), then the intercept $\tilde{r_{λ}}$ is strictly positive if $λ \in] 0, 1 [$ resp. strictly negative if $λ \in R \ [0, 1]$ . In contrast, for the case $α_{A} \neq α_{H}$ , the intercept $\tilde{r_{λ}}$ can assume any sign, take e.g., $(β_{A}, β_{H}, α_{A}, α_{H}, λ) \in {(3.7, 0.9, 2.0, 1.0, 0.5), (4, 2, 1.6, 1, 2)}$ for $\tilde{r_{λ}} > 0$ , $(β_{A}, β_{H}, α_{A}, α_{H}, λ) \in {(3.6, 0.9, 2.0, 1.0, 0.5), (4, 2, 1.5, 1, 2)}$ for $\tilde{r_{λ}} = 0$ , and $(β_{A}, β_{H}, α_{A}, α_{H}, λ) \in {(3.5, 0.9, 2.0, 1.0, 0.5), (4, 2, 1.4, 1, 2)}$ for $\tilde{r_{λ}} < 0$ ; again, the exemplary parameter constellations have concrete interpretations in our running-example epidemiological context of Section 2.3.
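
The example constellations quoted in (P19) and (P20) can be verified directly. The following sketch implements ϕ_λ as the difference between the weighted geometric and the weighted arithmetic mean of f_A(x) = α_A + β_A x and f_H(x) = α_H + β_H x, its derivative as in (P17), and the asymptote coefficients as in (P20); the function names are hypothetical.

```python
def phi(x, beta_A, beta_H, alpha_A, alpha_H, lam):
    """phi_lambda(x) = f_A^lam * f_H^(1-lam) - (lam*f_A + (1-lam)*f_H)."""
    f_A, f_H = alpha_A + beta_A * x, alpha_H + beta_H * x
    return f_A**lam * f_H**(1.0 - lam) - (lam * f_A + (1.0 - lam) * f_H)

def phi_prime(x, beta_A, beta_H, alpha_A, alpha_H, lam):
    """First derivative of phi_lambda, cf. (P17)."""
    f_A, f_H = alpha_A + beta_A * x, alpha_H + beta_H * x
    beta_lam = lam * beta_A + (1.0 - lam) * beta_H
    return (lam * beta_A * f_A**(lam - 1.0) * f_H**(1.0 - lam)
            + (1.0 - lam) * beta_H * f_A**lam * f_H**(-lam)
            - beta_lam)

def asymptote(beta_A, beta_H, alpha_A, alpha_H, lam):
    """Intercept and slope (r_tilde, s_tilde) of the asymptote in (P20)."""
    t = beta_A / beta_H
    alpha_lam = lam * alpha_A + (1.0 - lam) * alpha_H
    beta_lam = lam * beta_A + (1.0 - lam) * beta_H
    r = lam * alpha_A * t**(lam - 1.0) + (1.0 - lam) * alpha_H * t**lam - alpha_lam
    s = beta_A**lam * beta_H**(1.0 - lam) - beta_lam
    return r, s

# Sign of phi'(0) for the three lambda = 0.5 constellations of (P19).
for setup in [(4, 2, 3, 1, 0.5), (4, 2, 4, 1, 0.5), (4, 2, 5, 1, 0.5)]:
    print(setup, phi_prime(0.0, *setup))
```

The same functions can be reused to test candidate lines against the constraint (35) on a grid of integers.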

The properties (P15) to (P20) above describe in detail the characteristics of the function

ϕ_{λ} (\cdot) = ϕ (\cdot, β_{A}, β_{H}, α_{A}, α_{H}, λ)

. In the previous parameter setup

P_{NI} \cup P_{SP, 1}

, this function is linear, which can be seen from (P19). In the current parameter setup

P_{SP} \ P_{SP, 1}

, this function can basically be classified into four different types. From (P16) to (P20) it is easy to see that for all current parameter constellations the particular choices

p_{λ}^{A} : = α_{A}^{λ} α_{H}^{1 - λ} > 0, q_{λ}^{A} : = β_{A}^{λ} β_{H}^{1 - λ} > 0,

(45)

which correspond to the following choices in (35)

r_{λ}^{A} : = α_{A}^{λ} α_{H}^{1 - λ} - α_{λ} \leq 0 (resp. \geq 0), s_{λ}^{A} : = β_{A}^{λ} β_{H}^{1 - λ} - β_{λ} \leq 0 (resp. \geq 0),

– where

A = L

(resp.

A = U

)–lead to the tightest lower bound

B_{λ, X_{0}, n}^{L}

(resp. upper bound

B_{λ, X_{0}, n}^{U}

) for

H_{λ} (P_{A, n} ∥ P_{H, n})

in (40) in the case

λ \in] 0, 1 [

(resp.

λ \in R \ [0, 1]

). Notice that for the previous parameter setup

(β_{A}, β_{H}, α_{A}, α_{H}) \in (P_{NI} \cup P_{SP, 1})

these choices led to the exact values of the Hellinger integral and to the simplification

(p_{λ}^{E} / q_{λ}^{E}) \cdot β_{λ} - α_{λ} = 0

, which implies

b_{n}^{(p_{λ}^{E}, q_{λ}^{E})} = (α_{A} / β_{A}) \cdot a_{n}^{(q_{λ}^{E})}

. In contrast, in the current parameter setup

(β_{A}, β_{H}, α_{A}, α_{H}) \in P_{SP} \ P_{SP, 1}

we only derive the optimal lower (resp. upper) bound for

λ \in] 0, 1 [

(resp.

λ \in R \ [0, 1]

) by using the parameters

p_{λ}^{A}, q_{λ}^{A}

for

A = L

(resp.

A = U

) and

(p_{λ}^{A} / q_{λ}^{A}) \cdot β_{λ} - α_{λ} \neq 0

. For a better distinguishability and easier reference we thus stick to the

L -

notation (resp.

U -

notation) here.
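
The simplification on P_{SP, 1} mentioned above can be verified in a few lines. The following worked step is a sketch: it assumes that the linear interrelation (38) has the form b_n^{(p, q)} = (p / q) (a_n^{(q)} + β_λ) - α_λ, which is the form visible in Proposition 6(b), and it uses the shorthand c := α_A / β_A = α_H / β_H, which is introduced here only for illustration.

```latex
\begin{align*}
\frac{p_{\lambda}^{E}}{q_{\lambda}^{E}}
  &= \Big(\frac{\alpha_{A}}{\beta_{A}}\Big)^{\lambda}
     \Big(\frac{\alpha_{H}}{\beta_{H}}\Big)^{1-\lambda}
   = c^{\lambda} \, c^{1-\lambda} = c, \\
\frac{p_{\lambda}^{E}}{q_{\lambda}^{E}} \, \beta_{\lambda} - \alpha_{\lambda}
  &= \lambda \, (c \, \beta_{A} - \alpha_{A}) + (1-\lambda) \, (c \, \beta_{H} - \alpha_{H}) = 0, \\
b_{n}^{(p_{\lambda}^{E}, q_{\lambda}^{E})}
  &= c \, \big(a_{n}^{(q_{\lambda}^{E})} + \beta_{\lambda}\big) - \alpha_{\lambda}
   = \frac{\alpha_{A}}{\beta_{A}} \, a_{n}^{(q_{\lambda}^{E})} .
\end{align*}
```

Outside P_{SP, 1} the two fractions α_A / β_A and α_H / β_H differ, so the second line no longer vanishes; this is the origin of the additional term n · ((p_λ^L / q_λ^L) · β_λ - α_λ) in Proposition 6(b).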

3.5. Lower Bounds for the Cases $(β_{A}, β_{H}, α_{A}, α_{H}, λ) \in (P_{SP} \ P_{SP, 1}) \times] 0, 1 [$

The discussion above implies that the lower bound

B_{λ, X_{0}, n}^{L}

for the Hellinger integral

H_{λ} (P_{A, n} ∥ P_{H, n})

in (40) is optimal for the choices

p_{λ}^{L}, q_{λ}^{L} > 0

defined in (45). If

β_{A} \neq β_{H}

, due to Properties 1 (P1) and Lemma A1, the sequence

{(a_{n}^{(q_{λ}^{L})})}_{n \in N}

is strictly negative and strictly decreasing and converges to the unique negative solution of the Equation (44). Furthermore, due to (P5), the sequence

{(b_{n}^{(p_{λ}^{L}, q_{λ}^{L})})}_{n \in N}

, as defined in (37), is strictly decreasing. Since

b_{1}^{(p_{λ}^{L}, q_{λ}^{L})} = p_{λ}^{L} - α_{λ} \leq 0

by Lemma A1, with equality iff

α_{A} = α_{H}

, the sequence

{(b_{n}^{(p_{λ}^{L}, q_{λ}^{L})})}_{n \in N}

is also strictly negative (with the exception

b_{1}^{(p_{λ}^{L}, q_{λ}^{L})} = 0

for

α_{A} = α_{H}

) and strictly decreasing. If

β_{A} = β_{H}

and thus

α_{A} \neq α_{H}

, due to (P2), (P6) and Lemma A1, there holds

a_{n}^{(q_{λ}^{L})} \equiv 0

and

b_{n}^{(p_{λ}^{L}, q_{λ}^{L})} \equiv p_{λ}^{L} - α_{λ} < 0

. Thus, analogously to the cases

P_{NI} \cup P_{SP, 1}

we obtain

Proposition 6.

For all

(β_{A}, β_{H}, α_{A}, α_{H}, λ) \in (P_{SP} \ P_{SP, 1}) \times] 0, 1 [

and all initial population sizes

X_{0} \in N

there holds with

p_{λ}^{L} : = α_{A}^{λ} α_{H}^{1 - λ}, q_{λ}^{L} : = β_{A}^{λ} β_{H}^{1 - λ}

\begin{matrix} (a) & B_{λ, X_{0}, 1}^{L} = exp \{(β_{A}^{λ} β_{H}^{1 - λ} - β_{λ}) \cdot X_{0} + α_{A}^{λ} α_{H}^{1 - λ} - α_{λ}\} < 1, \\ (b) & the sequence of lower bounds {(B_{λ, X_{0}, n}^{L})}_{n \in N} for H_{λ} (P_{A, n} ∥ P_{H, n}) given by \\ B_{λ, X_{0}, n}^{L} = exp \{a_{n}^{(q_{λ}^{L})} \cdot X_{0} + \frac{p_{λ}^{L}}{q_{λ}^{L}} \sum_{k = 1}^{n} a_{k}^{(q_{λ}^{L})} + n \cdot (\frac{p_{λ}^{L}}{q_{λ}^{L}} \cdot β_{λ} - α_{λ})\} is strictly decreasing, \\ (c) & lim_{n \to \infty} B_{λ, X_{0}, n}^{L} = 0, \\ (d) & lim_{n \to \infty} \frac{1}{n} log B_{λ, X_{0}, n}^{L} = \frac{p_{λ}^{L}}{q_{λ}^{L}} \cdot (x_{0}^{(q_{λ}^{L})} + β_{λ}) - α_{λ} = p_{λ}^{L} \cdot e^{x_{0}^{(q_{λ}^{L})}} - α_{λ} < 0 . \\ (e) & the map X_{0} \mapsto B_{λ, X_{0}, n}^{L} is strictly decreasing . \end{matrix}
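
Proposition 6 can be illustrated for the constellation (β_A, β_H, α_A, α_H, λ) = (0.4, 0.8, 5, 3, 0.5) used in (P13). The sketch below again assumes a_0 = 0 in the recursion (36) and b_n = p · exp(a_{n-1}) - α_λ for (37); the function names and the choice X_0 = 10 in the usage lines are hypothetical. It evaluates log B^L both as a_n X_0 + Σ b_k and via the representation in part (b), which should agree up to rounding.

```python
import math

def sequences(p, q, alpha_lam, beta_lam, n):
    """Return ([a_1..a_n], [b_1..b_n]) with a_0 = 0 (assumed start value)."""
    a_list, b_list, a = [], [], 0.0
    for _ in range(n):
        b_list.append(p * math.exp(a) - alpha_lam)   # b_k uses a_{k-1}
        a = q * math.exp(a) - beta_lam               # a_k
        a_list.append(a)
    return a_list, b_list

def log_lower_bound(beta_A, beta_H, alpha_A, alpha_H, lam, X0, n, closed_form=False):
    """log B^L_{lam,X0,n} for lam in ]0,1[ with the choices (45)."""
    p = alpha_A**lam * alpha_H**(1.0 - lam)
    q = beta_A**lam * beta_H**(1.0 - lam)
    alpha_lam = lam * alpha_A + (1.0 - lam) * alpha_H
    beta_lam = lam * beta_A + (1.0 - lam) * beta_H
    a, b = sequences(p, q, alpha_lam, beta_lam, n)
    if closed_form:   # representation of Proposition 6(b)
        return a[-1] * X0 + (p / q) * sum(a) + n * ((p / q) * beta_lam - alpha_lam)
    return a[-1] * X0 + sum(b)

# Hypothetical usage: initial population X0 = 10, horizons n = 1, 5, 25.
for n in (1, 5, 25):
    print(n, log_lower_bound(0.4, 0.8, 5.0, 3.0, 0.5, 10, n))
```

Dividing the returned value by n for large n gives a numerical approximation of the rate in part (d).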

3.6. Goals for Upper Bounds for the Cases $(β_{A}, β_{H}, α_{A}, α_{H}, λ) \in (P_{SP} \ P_{SP, 1}) \times] 0, 1 [$

For parameter constellations

(β_{A}, β_{H}, α_{A}, α_{H}, λ) \in (P_{SP} \ P_{SP, 1}) \times] 0, 1 [

, in contrast to the treatment of the lower bounds (cf. the previous Section 3.5), the fine-tuning of the upper bounds of the Hellinger integrals

H_{λ} (P_{A, n} ∥ P_{H, n})

is much more involved. To begin with, let us mention that the monotonicity-concerning Properties 2 (P10) to (P12) imply that for a tight upper bound

B_{λ, X_{0}, n}^{U}

(cf. (40)) one should choose parameters

p_{λ}^{U} \geq p_{λ}^{L} > 0, q_{λ}^{U} \geq q_{λ}^{L} > 0

as small as possible. Due to the concavity (cf. Properties 3 (P19)) of the function

ϕ_{λ} (\cdot)

, the linear upper bound

ϕ_{λ}^{U} (\cdot)

(on the ultimately relevant subdomain

N_{0}

) thus must hit the function

ϕ_{λ} (\cdot)

in at least one point

x \in N_{0}

, which corresponds to some “discrete tangent line” of

ϕ_{λ} (\cdot)

in x, or in at most two points

x, x + 1 \in N_{0}

, which corresponds to the secant line of

ϕ_{λ} (\cdot)

across its arguments x and

x + 1

. Accordingly, there is in general no overall best upper bound; of course, one way to obtain “good” upper bounds for

H_{λ} (P_{A, n} ∥ P_{H, n})

is to solve the optimization problem

(\bar{p_{λ}^{U}}, \bar{q_{λ}^{U}}) : = arg min_{(p_{λ}^{U}, q_{λ}^{U})} \{exp \{a_{n}^{(q_{λ}^{U})} \cdot X_{0} + \sum_{k = 1}^{n} b_{k}^{(p_{λ}^{U}, q_{λ}^{U})}\}\},

(46)

subject to the constraint (35). However, the corresponding result generally depends on the particular choice of the initial population

X_{0} \in N

and on the observation time horizon

n \in N

. Hence, there is in general no overall optimal choice of

p_{λ}^{U}, q_{λ}^{U}

without the incorporation of further goal-dependent constraints such as

{lim}_{n \to \infty} B_{λ, X_{0}, n}^{U} = 0

in case of

{lim}_{n \to \infty} H_{λ} (P_{A, n} ∥ P_{H, n}) = 0

. By the way, mainly because of the non-explicitness of the sequence

{(a_{n}^{(q_{λ}^{U})})}_{n \in N}

(due to the generally not explicitly solvable recursion (36)) and the discreteness of the constraint (35), this optimization problem does not seem straightforward to solve anyway. The choice of parameters

p_{λ}^{U}, q_{λ}^{U}

for the upper bound

B_{λ, X_{0}, n}^{U} \geq H_{λ} (P_{A, n} ∥ P_{H, n})

can be made according to different, partially incompatible (“optimality-” resp. “goodness-”) criteria and goals, such as:

(G1): the validity of $B_{λ, X_{0}, n}^{U} < 1$ simultaneously for all initial configurations $X_{0} \in N$ , all observation horizons $n \in N$ and all $λ \in] 0, 1 [$ , which leads to a strict improvement of the general upper bound $H_{λ} (P_{A, n} ∥ P_{H, n}) < 1$ (cf. (9));
(G2): the determination of the long-term-limits ${lim}_{n \to \infty} H_{λ} (P_{A, n} ∥ P_{H, n})$ respectively ${lim}_{n \to \infty} B_{λ, X_{0}, n}^{U}$ for all $X_{0} \in N$ and all $λ \in] 0, 1 [$ ; in particular, one would like to check whether ${lim}_{n \to \infty} H_{λ} (P_{A, n} ∥ P_{H, n}) = 0$ , which implies that the families of probability distributions ${(P_{A, n})}_{n \in N}$ and ${(P_{H, n})}_{n \in N}$ are asymptotically distinguishable (entirely separated), cf. (25);
(G3): the determination of the time-asymptotical growth rates ${lim}_{n \to \infty} \frac{1}{n} log (H_{λ} (P_{A, n} ∥ P_{H, n}))$ resp. ${lim}_{n \to \infty} \frac{1}{n} log (B_{λ, X_{0}, n}^{U})$ for all $X_{0} \in N$ and all $λ \in] 0, 1 [$ .

Further goals–with which we do not deal here for the sake of brevity–are for instance (i) a very good tightness of the upper bound

B_{λ, X_{0}, n}^{U}

for

n \geq N

for some fixed large

N \in N

, or (ii) the criterion (G1) with fixed (rather than arbitrary) initial population size

X_{0} \in N

.

Let us briefly discuss the three Goals (G1) to (G3) and their challenges: due to Theorem 1, Goal (G1) can only be achieved if the sequence

{(a_{n}^{(q_{λ}^{U})})}_{n \in N}

is non-increasing, since otherwise, for each fixed observation horizon

n \in N

there is a large enough initial population size

X_{0}

such that the upper bound component

{\tilde{B}}_{λ, X_{0}, n}^{(p_{λ}^{U}, q_{λ}^{U})}

becomes larger than 1, and thus

B_{λ, X_{0}, n}^{U} = 1

(cf. (40)). Hence, Properties 1 (P1) and (P2) imply that one should have

q_{λ}^{U} \leq β_{λ}

. Then, the sequence

{(b_{n}^{(p_{λ}^{U}, q_{λ}^{U})})}_{n \in N}

is also non-increasing. However, since

b_{n}^{(p_{λ}^{U}, q_{λ}^{U})}

might be positive for some (even all)

n \in N

, the sum

{(\sum_{k = 1}^{n} b_{k}^{(p_{λ}^{U}, q_{λ}^{U})})}_{n \in N}

is not necessarily decreasing. Nevertheless, the restriction

q_{λ}^{U} - β_{λ} \leq 0 and p_{λ}^{U} - α_{λ} \leq 0, where at least one of the inequalities is strict,

(47)

ensures that both sequences

{(a_{n}^{(q_{λ}^{U})})}_{n \in N}

and

{(b_{n}^{(p_{λ}^{U}, q_{λ}^{U})})}_{n \in N}

are nonpositive and decreasing, where at least one sequence is strictly negative, implying that the sum

{(\sum_{k = 1}^{n} b_{k}^{(p_{λ}^{U}, q_{λ}^{U})})}_{n \in N}

is strictly negative for

n \geq 2

and strictly decreasing. To see this, suppose that (47) is satisfied with two strict inequalities. Then,

{(a_{n}^{(q_{λ}^{U})})}_{n \in N}

as well as

{(b_{n}^{(p_{λ}^{U}, q_{λ}^{U})})}_{n \in N}

are strictly negative and strictly decreasing. If

q_{λ}^{U} = β_{λ}

and

p_{λ}^{U} < α_{λ}

, we see from (P2) and (P6) that

a_{n}^{(q_{λ}^{U})} \equiv 0

and that

b_{n}^{(p_{λ}^{U}, q_{λ}^{U})} \equiv p_{λ}^{U} - α_{λ} < 0

(notice that

α_{λ} = 0

is not possible in the current setup

P_{SP} \ P_{SP, 1}

and for

λ \in] 0, 1 [

). In the last case

q_{λ}^{U} < β_{λ}

and

p_{λ}^{U} = α_{λ}

, from (P1) and (P5) it follows that

{(a_{n}^{(q_{λ}^{U})})}_{n \in N}

is strictly negative and strictly decreasing, as well as that

b_{1}^{(p_{λ}^{U}, q_{λ}^{U})} = 0

and

{(b_{n}^{(p_{λ}^{U}, q_{λ}^{U})})}_{n \in N}

is strictly decreasing and strictly negative for

n \geq 2

. Thus, whenever (47) is satisfied, the sum

{(\sum_{k = 1}^{n} b_{k}^{(p_{λ}^{U}, q_{λ}^{U})})}_{n \in N}

is strictly negative for

n \geq 2

and strictly decreasing.
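The sign and monotonicity claims for the last case of (47) can be checked numerically. The following is a minimal sketch, not taken from the paper: it assumes that the recursion (36) has the form a_n = q·exp(a_{n-1}) − β_λ with a_0 = 0 and that b_n = p·exp(a_{n-1}) − α_λ (both forms are inferred from the identities quoted in this section), and all numerical values are hypothetical.

```python
import math

def a_b_sequences(p, q, alpha_lam, beta_lam, n):
    """Return ([a_1..a_n], [b_1..b_n]) under the ASSUMED recursions
    a_k = q * exp(a_{k-1}) - beta_lam (with a_0 = 0) and
    b_k = p * exp(a_{k-1}) - alpha_lam."""
    a_vals, b_vals, a_prev = [], [], 0.0
    for _ in range(n):
        b_vals.append(p * math.exp(a_prev) - alpha_lam)
        a_prev = q * math.exp(a_prev) - beta_lam
        a_vals.append(a_prev)
    return a_vals, b_vals

# Hypothetical numbers for the last case of (47): q < beta_lam and p = alpha_lam.
alpha_lam, beta_lam = 2.0, 1.35
a_vals, b_vals = a_b_sequences(p=2.0, q=1.30, alpha_lam=alpha_lam,
                               beta_lam=beta_lam, n=30)

# Running sums sum_{k<=n} b_k for n = 1, ..., 30.
partial_sums, running = [], 0.0
for b in b_vals:
    running += b
    partial_sums.append(running)

print("a_1, a_2, a_3:", a_vals[:3])
print("b_1, b_2, b_3:", b_vals[:3])
print("sums for n = 1, 2, 3:", partial_sums[:3])
```

In this run b_1 equals zero, so the sum only becomes strictly negative from n = 2 onwards, in line with the statement above.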

To achieve Goal (G2), we have to require that the sequence

{(a_{n}^{(q_{λ}^{U})})}_{n \in N}

converges, which is the case if either

q_{λ}^{U} \leq β_{λ}

or

β_{λ} < q_{λ}^{U} \leq min {1, e^{β_{λ} - 1}}

(cf. Properties 1 (P1) to (P3)). From the upper bound component

{\tilde{B}}_{λ, X_{0}, n}^{(p_{λ}^{U}, q_{λ}^{U})}

(42) we conclude that Goal (G2) is met if the sequence

{(b_{n}^{(p_{λ}^{U}, q_{λ}^{U})})}_{n \in N}

converges to a negative limit, i.e.,

{lim}_{n \to \infty} b_{n}^{(p_{λ}^{U}, q_{λ}^{U})} = p_{λ}^{U} \cdot e^{x_{0}^{(q_{λ}^{U})}} - α_{λ} < 0

. Notice that this condition holds true if (47) is satisfied: suppose that

q_{λ}^{U} < β_{λ}

, then

x_{0}^{(q_{λ}^{U})} < 0

and

p_{λ}^{U} \cdot e^{x_{0}^{(q_{λ}^{U})}} - α_{λ} < p_{λ}^{U} - α_{λ} \leq 0

. On the other hand, if

p_{λ}^{U} - α_{λ} < 0

, one obtains

x_{0}^{(q_{λ}^{U})} \leq 0

leading to

p_{λ}^{U} \cdot e^{x_{0}^{(q_{λ}^{U})}} - α_{λ} \leq p_{λ}^{U} - α_{λ} < 0

.

The examination of Goal (G2) above enters into the discussion of Goal (G3): if the sequence

{(a_{n}^{(q_{λ}^{U})})}_{n \in N}

converges and

{lim}_{n \to \infty} B_{λ, X_{0}, n}^{U} = 0

, then there holds

lim_{n \to \infty} \frac{1}{n} log (B_{λ, X_{0}, n}^{U}) = lim_{n \to \infty} \frac{1}{n} log ({\tilde{B}}_{λ, X_{0}, n}^{(p_{λ}^{U}, q_{λ}^{U})}) = p_{λ}^{U} \cdot e^{x_{0}^{(q_{λ}^{U})}} - α_{λ} .

(48)
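The growth rate in (48) can be illustrated numerically. The sketch below again assumes the recursion a_n = q·exp(a_{n-1}) − β_λ (a_0 = 0) and b_n = p·exp(a_{n-1}) − α_λ, which is inferred from this section rather than quoted from (36); the parameter values and the helper names `limit_of_a` and `log_bound_component` are hypothetical.

```python
import math

def limit_of_a(q, beta_lam, tol=1e-14, max_iter=100_000):
    """Iterate a -> q*exp(a) - beta_lam from a_0 = 0 until it settles (assumed recursion)."""
    a = 0.0
    for _ in range(max_iter):
        a_next = q * math.exp(a) - beta_lam
        if abs(a_next - a) < tol:
            return a_next
        a = a_next
    raise RuntimeError("the sequence (a_n) did not settle; check q against (P1) to (P3)")

def log_bound_component(p, q, alpha_lam, beta_lam, pop0, n):
    """Logarithm of the bound component: a_n * X_0 + sum_{k<=n} b_k."""
    a_prev, b_sum = 0.0, 0.0
    for _ in range(n):
        b_sum += p * math.exp(a_prev) - alpha_lam
        a_prev = q * math.exp(a_prev) - beta_lam
    return a_prev * pop0 + b_sum

# Hypothetical values satisfying (47) with two strict inequalities.
alpha_lam, beta_lam, p, q = 2.0, 1.35, 1.9, 1.30
x_limit = limit_of_a(q, beta_lam)
rate = p * math.exp(x_limit) - alpha_lam          # right-hand side of (48)
n = 5000
empirical_rate = log_bound_component(p, q, alpha_lam, beta_lam, pop0=5, n=n) / n

print("limit of a_n:", x_limit)
print("rate from (48):", rate)
print("(1/n) log of bound component at n = 5000:", empirical_rate)
```

Working with the logarithm avoids underflow of the bound itself for large n.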

For the case

(β_{A}, β_{H}, α_{A}, α_{H}, λ) \in (P_{SP} \ P_{SP, 1}) \times] 0, 1 [

, let us now start with our comprehensive investigations of the upper bounds, where we focus on fulfilling the condition (47) which tackles Goals (G1) and (G2) simultaneously; then, the Goal (G3) can be achieved by (48). As indicated above, various different parameter subcases can lead to different Hellinger-integral-upper-bound details, which we work out in the following. For better transparency, we employ the following notations (where the first four are just reminders of sets which were already introduced above)

\begin{matrix}
P_{NI} & : = & \{(β_{A}, β_{H}, α_{A}, α_{H}) \in {[0, \infty [}^{4} : α_{A} = α_{H} = 0; β_{A} > 0; β_{H} > 0; β_{A} \neq β_{H}\}, \\
P_{SP} & : = & \{(β_{A}, β_{H}, α_{A}, α_{H}) \in {] 0, \infty [}^{4} : (α_{A} \neq α_{H}) \text{ or } (β_{A} \neq β_{H}) \text{ or both}\}, \\
P & : = & P_{NI} \cup P_{SP}, \\
P_{SP, 1} & : = & \{(β_{A}, β_{H}, α_{A}, α_{H}) \in P_{SP} : α_{A} \neq α_{H}, β_{A} \neq β_{H}, \frac{α_{A}}{β_{A}} = \frac{α_{H}}{β_{H}}\}, \\
P_{SP, 2} & : = & \{(β_{A}, β_{H}, α_{A}, α_{H}) \in P_{SP} : α_{A} = α_{H}, β_{A} \neq β_{H}\}, \\
P_{SP, 3} & : = & \{(β_{A}, β_{H}, α_{A}, α_{H}) \in P_{SP} : α_{A} \neq α_{H}, β_{A} \neq β_{H}, \frac{α_{A}}{β_{A}} \neq \frac{α_{H}}{β_{H}}\} = P_{SP, 3 a} \cup P_{SP, 3 b} \cup P_{SP, 3 c}, \\
P_{SP, 3 a} & : = & \{(β_{A}, β_{H}, α_{A}, α_{H}) \in P_{SP} : α_{A} \neq α_{H}, β_{A} \neq β_{H}, \frac{α_{A}}{β_{A}} \neq \frac{α_{H}}{β_{H}}, \frac{α_{A} - α_{H}}{β_{H} - β_{A}} \in] - \infty, 0 [\}, \\
P_{SP, 3 b} & : = & \{(β_{A}, β_{H}, α_{A}, α_{H}) \in P_{SP} : α_{A} \neq α_{H}, β_{A} \neq β_{H}, \frac{α_{A}}{β_{A}} \neq \frac{α_{H}}{β_{H}}, \frac{α_{A} - α_{H}}{β_{H} - β_{A}} \in] 0, \infty [ \setminus N\}, \\
P_{SP, 3 c} & : = & \{(β_{A}, β_{H}, α_{A}, α_{H}) \in P_{SP} : α_{A} \neq α_{H}, β_{A} \neq β_{H}, \frac{α_{A}}{β_{A}} \neq \frac{α_{H}}{β_{H}}, \frac{α_{A} - α_{H}}{β_{H} - β_{A}} \in N\}, \\
P_{SP, 4} & : = & \{(β_{A}, β_{H}, α_{A}, α_{H}) \in P_{SP} : α_{A} \neq α_{H} > 0, β_{A} = β_{H}\} = P_{SP, 4 a} \cup P_{SP, 4 b}, \\
P_{SP, 4 a} & : = & \{(β_{A}, β_{H}, α_{A}, α_{H}) \in P_{SP} : α_{A} \neq α_{H} > 0, β_{A} = β_{H} \in] 0, 1 [\}, \\
P_{SP, 4 b} & : = & \{(β_{A}, β_{H}, α_{A}, α_{H}) \in P_{SP} : α_{A} \neq α_{H} > 0, β_{A} = β_{H} \in [1, \infty [\};
\end{matrix}

(49)

notice that because of Lemma A1 and of the Properties 3 (P15) one gets on the domain

] 0, \infty [

the relation

ϕ_{λ} (x) = 0

iff

f_{A} (x) = f_{H} (x)

iff

x = x^{*} : = \frac{α_{H} - α_{A}}{β_{A} - β_{H}} \in] 0, \infty [

.
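The case distinction in (49) is mechanical, so it can be written down as a small classifier. The following sketch is only an illustration: the function name `classify` and the returned string labels are hypothetical, and the inputs are converted to exact fractions because the subclasses depend on exact equalities and on whether the intersection point is an integer.

```python
from fractions import Fraction

def classify(beta_a, beta_h, alpha_a, alpha_h):
    """Label the subset from (49) containing the tuple (P_NI or a subclass of P_SP).

    Pass the numbers as strings (or ints) so that Fraction represents them
    exactly; binary floats could spoil the equality and integrality tests.
    """
    b_a, b_h, a_a, a_h = (Fraction(v) for v in (beta_a, beta_h, alpha_a, alpha_h))
    if b_a <= 0 or b_h <= 0 or a_a < 0 or a_h < 0:
        raise ValueError("tuple lies outside P")
    if a_a == 0 and a_h == 0:
        if b_a == b_h:
            raise ValueError("tuple lies outside P")
        return "P_NI"
    if a_a == 0 or a_h == 0:
        raise ValueError("tuple lies outside P")
    if a_a == a_h:
        if b_a == b_h:
            raise ValueError("tuple lies outside P")
        return "P_SP,2"
    if b_a == b_h:
        return "P_SP,4a" if b_a < 1 else "P_SP,4b"
    if a_a / b_a == a_h / b_h:
        return "P_SP,1"
    # Intersection point of f_A and f_H; it is nonzero because alpha_A != alpha_H.
    x_star = (a_a - a_h) / (b_h - b_a)
    if x_star < 0:
        return "P_SP,3a"
    return "P_SP,3c" if x_star.denominator == 1 else "P_SP,3b"

for params in [("1.8", "0.9", "2.7", "0.7"), ("1.8", "0.9", "1.1", "3.0"),
               ("1.8", "0.9", "1.2", "3.0"), ("1.2", "0.9", "4", "3")]:
    print(params, "->", classify(*params))
```

The four tuples in the loop are the example parameter sets used in Sections 3.8 to 3.10 and in Example 1.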

3.7. Upper Bounds for the Cases $(β_{A}, β_{H}, α_{A}, α_{H}, λ) \in P_{SP, 2} \times] 0, 1 [$

For this parameter constellation, one has

ϕ_{λ} (0) = 0

and

ϕ_{λ}^{'} (0) = 0

(cf. Properties 3 (P16), (P17)). Thus, the only admissible intercept choice satisfying (47) is

r_{λ}^{U} = 0 = p_{λ}^{U} - α_{λ}

(i.e.,

p_{λ}^{U} = p^{U} (β_{A}, β_{H}, α_{A}, α_{H}, λ) = α_{λ} = α > 0

), and the minimal admissible slope which implies (35) for all

x \in N_{0}

is given by

s_{λ}^{U} = \frac{ϕ_{λ} (1) - ϕ_{λ} (0)}{1 - 0} = q_{λ}^{U} - β_{λ} = a_{1}^{(q_{λ}^{U})} < 0

(i.e.,

q_{λ}^{U} = q^{U} (β_{A}, β_{H}, α_{A}, α_{H}, λ) = {(α + β_{A})}^{λ} {(α + β_{H})}^{1 - λ} - α > 0

). Analogously to the investigation for

P_{SP, 1}

in the above-mentioned Section 3.3, one can derive that

{(a_{n}^{(q_{λ}^{U})})}_{n \in N}

is strictly negative, strictly decreasing, and converges to

x_{0}^{(q_{λ}^{U})} \in] - β_{λ}, q_{λ}^{U} - β_{λ} [

as indicated in Properties 1 (P1). Moreover, in the same manner as for the case

P_{SP, 1}

this leads to

Proposition 7.

For all

(β_{A}, β_{H}, α_{A}, α_{H}, λ) \in P_{SP, 2} \times] 0, 1 [

and all initial population sizes

X_{0} \in N

there holds with

p_{λ}^{U} = α, q_{λ}^{U} = {(α + β_{A})}^{λ} {(α + β_{H})}^{1 - λ} - α

\begin{matrix}
(a) & B_{λ, X_{0}, 1}^{U} = \exp \{(q_{λ}^{U} - β_{λ}) \cdot X_{0}\} < 1, \\
(b) & \text{the sequence } {(B_{λ, X_{0}, n}^{U})}_{n \in N} \text{ of upper bounds for } H_{λ} (P_{A, n} ∥ P_{H, n}) \text{ given by} \\
 & B_{λ, X_{0}, n}^{U} = \exp \{a_{n}^{(q_{λ}^{U})} \cdot X_{0} + \sum_{k = 1}^{n} b_{k}^{(p_{λ}^{U}, q_{λ}^{U})}\} \\
 & \text{is strictly decreasing,} \\
(c) & \lim_{n \to \infty} B_{λ, X_{0}, n}^{U} = 0 = \lim_{n \to \infty} H_{λ} (P_{A, n} ∥ P_{H, n}), \\
(d) & \lim_{n \to \infty} \frac{1}{n} \log B_{λ, X_{0}, n}^{U} = p_{λ}^{U} \cdot e^{x_{0}^{(q_{λ}^{U})}} - α_{λ} = α (e^{x_{0}^{(q_{λ}^{U})}} - 1) < 0, \\
(e) & \text{the map } X_{0} \mapsto B_{λ, X_{0}, n}^{U} \text{ is strictly decreasing.}
\end{matrix}
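The parameter choice of Proposition 7 can be sanity-checked on a concrete tuple. The sketch below assumes that ϕ_λ(x) = f_A(x)^λ f_H(x)^{1−λ} − (λ f_A(x) + (1−λ) f_H(x)) with f_•(x) = α_• + β_• x, and that (35) requires the line with intercept p − α_λ and slope q − β_λ to dominate ϕ_λ on the nonnegative integers; both are inferred from the formulas of this section, and the numerical values are hypothetical.

```python
import math

def phi(x, beta_a, beta_h, alpha_a, alpha_h, lam):
    """Assumed form of phi_lambda: weighted geometric mean minus weighted arithmetic mean."""
    f_a, f_h = alpha_a + beta_a * x, alpha_h + beta_h * x
    return f_a**lam * f_h**(1 - lam) - (lam * f_a + (1 - lam) * f_h)

# Hypothetical tuple in P_SP,2 (equal immigration means), with lambda = 0.5 and X_0 = 5.
beta_a, beta_h, alpha, lam, pop0 = 1.8, 0.9, 2.0, 0.5, 5
beta_lam = lam * beta_a + (1 - lam) * beta_h

p_u = alpha                                                          # intercept p - alpha_lam = 0
q_u = (alpha + beta_a)**lam * (alpha + beta_h)**(1 - lam) - alpha
slope = q_u - beta_lam

# Largest value of phi(x) - slope * x over the integers 0..499; it should be
# (numerically) zero, attained at x = 0 and x = 1.
worst_gap = max(phi(x, beta_a, beta_h, alpha, alpha, lam) - slope * x
                for x in range(500))
bound_n1 = math.exp(slope * pop0)                                    # part (a) of Proposition 7

print("q^U:", q_u, " slope:", slope)
print("max of phi(x) - slope*x on 0..499:", worst_gap)
print("B^U for n = 1:", bound_n1)
```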

3.8. Upper Bounds for the Cases $(β_{A}, β_{H}, α_{A}, α_{H}, λ) \in P_{SP, 3 a} \times] 0, 1 [$

From Properties 3 (P16) one gets

ϕ_{λ} (0) < 0

, whereas

ϕ_{λ}^{'} (0)

can assume any sign, take e.g., the parameters

(β_{A}, β_{H}, α_{A}, α_{H}, λ) = (1.8, 0.9, 2.7, 0.7, 0.5)

for

ϕ_{λ}^{'} (0) < 0

,

(β_{A}, β_{H}, α_{A}, α_{H}, λ) = (1.8, 0.9, 2.8, 0.7, 0.5)

for

ϕ_{λ}^{'} (0) = 0

and

(β_{A}, β_{H}, α_{A}, α_{H}, λ) = (1.8, 0.9, 2.9, 0.7, 0.5)

for

ϕ_{λ}^{'} (0) > 0

; within our running-example epidemiological context of Section 2.3, this corresponds to a “nearly dangerous” infectious-disease-transmission situation

(H)

(with nearly critical reproduction number

β_{H} = 0.9

and importation mean of

α_{H} = 0.7

), whereas

(A)

describes a “dangerous” situation (with supercritical

β_{A} = 1.8

and

α_{A} = 2.7, 2.8, 2.9

). However, in all three subcases there holds

{max}_{x \in N_{0}} ϕ_{λ} (x) \leq {max}_{x \in [0, \infty [} ϕ_{λ} (x) < 0

. Thus, there clearly exist parameters

p_{λ}^{U} = p^{U} (β_{A}, β_{H}, α_{A}, α_{H}, λ),

q_{λ}^{U} = q^{U} (β_{A}, β_{H}, α_{A}, α_{H}, λ)

with

p_{λ}^{U} \in [α_{A}^{λ} α_{H}^{1 - λ}, α_{λ} [

and

q_{λ}^{U} \in [β_{A}^{λ} β_{H}^{1 - λ}, β_{λ} [

(implying (47)) such that (35) is satisfied. As explained above, we get the following

Proposition 8.

For all

(β_{A}, β_{H}, α_{A}, α_{H}, λ) \in P_{SP, 3 a} \times] 0, 1 [

there exist parameters

p_{λ}^{U}, q_{λ}^{U}

which satisfy

p_{λ}^{U} \in [α_{A}^{λ} α_{H}^{1 - λ}, α_{λ} [

and

q_{λ}^{U} \in [β_{A}^{λ} β_{H}^{1 - λ}, β_{λ} [

as well as (35) for all

x \in N_{0}

, and for all such pairs

(p_{λ}^{U}, q_{λ}^{U})

and all initial population sizes

X_{0} \in N

there holds

\begin{matrix}
(a) & B_{λ, X_{0}, 1}^{U} = \exp \{(q_{λ}^{U} - β_{λ}) \cdot X_{0} + p_{λ}^{U} - α_{λ}\} < 1, \\
(b) & \text{the sequence } {(B_{λ, X_{0}, n}^{U})}_{n \in N} \text{ of upper bounds for } H_{λ} (P_{A, n} ∥ P_{H, n}) \text{ given by} \\
 & B_{λ, X_{0}, n}^{U} = \exp \{a_{n}^{(q_{λ}^{U})} \cdot X_{0} + \sum_{k = 1}^{n} b_{k}^{(p_{λ}^{U}, q_{λ}^{U})}\} \\
 & \text{is strictly decreasing,} \\
(c) & \lim_{n \to \infty} B_{λ, X_{0}, n}^{U} = 0 = \lim_{n \to \infty} H_{λ} (P_{A, n} ∥ P_{H, n}), \\
(d) & \lim_{n \to \infty} \frac{1}{n} \log B_{λ, X_{0}, n}^{U} = p_{λ}^{U} \cdot e^{x_{0}^{(q_{λ}^{U})}} - α_{λ} < 0, \\
(e) & \text{the map } X_{0} \mapsto B_{λ, X_{0}, n}^{U} \text{ is strictly decreasing.}
\end{matrix}

Notice that all parts of this proposition also hold true for parameter pairs

(p_{λ}^{U}, q_{λ}^{U})

satisfying (35) and additionally either

p_{λ}^{U} = α_{λ}

,

q_{λ}^{U} < β_{λ}

or

p_{λ}^{U} < α_{λ}

,

q_{λ}^{U} = β_{λ}

.
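The three example tuples quoted at the beginning of Section 3.8 can be checked directly. The sketch below evaluates the derivative at zero through the expression λβ_A(α_A/α_H)^{λ−1} + (1−λ)β_H(α_A/α_H)^λ − β_λ given in the text; the closed form used for ϕ_λ itself is an assumption (weighted geometric mean of f_A and f_H minus their weighted arithmetic mean), and the maximum is only taken over a truncated integer range.

```python
def phi(x, beta_a, beta_h, alpha_a, alpha_h, lam):
    """Assumed form of phi_lambda with f(x) = alpha + beta * x."""
    f_a, f_h = alpha_a + beta_a * x, alpha_h + beta_h * x
    return f_a**lam * f_h**(1 - lam) - (lam * f_a + (1 - lam) * f_h)

def phi_prime_at_zero(beta_a, beta_h, alpha_a, alpha_h, lam):
    """Derivative of phi_lambda at x = 0, using the expression quoted in the text."""
    ratio = alpha_a / alpha_h
    beta_lam = lam * beta_a + (1 - lam) * beta_h
    return lam * beta_a * ratio**(lam - 1) + (1 - lam) * beta_h * ratio**lam - beta_lam

examples = [(1.8, 0.9, 2.7, 0.7, 0.5),
            (1.8, 0.9, 2.8, 0.7, 0.5),
            (1.8, 0.9, 2.9, 0.7, 0.5)]
derivatives = [phi_prime_at_zero(*ex) for ex in examples]
# Maximum of phi over the integers 0..200 (truncation is an assumption of this sketch).
integer_maxima = [max(phi(x, *ex) for x in range(201)) for ex in examples]

for ex, d, m in zip(examples, derivatives, integer_maxima):
    print(ex, " phi'(0) =", d, " max on 0..200 =", m)
```

For the middle tuple the ratio α_A/α_H equals 4, so the derivative reduces to 0.45 + 0.9 − 1.35, which is why it vanishes.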

Let us briefly illuminate the above-mentioned possible parameter choices, where we begin with the case of

ϕ_{λ}^{'} (0) \leq 0

, which corresponds to

λ β_{A} {(α_{A} / α_{H})}^{λ - 1} + (1 - λ) β_{H} {(α_{A} / α_{H})}^{λ} - β_{λ} \leq 0

(cf. (P17)); then, the function

ϕ_{λ} (\cdot)

is strictly negative, strictly decreasing, and–due to (P19)–strictly concave (and thus, the assumption

\frac{α_{H} - α_{A}}{β_{A} - β_{H}} < 0

is superfluous here). One pragmatic yet reasonable parameter choice is the following: take any intercept

p_{λ}^{U} \in [α_{A}^{λ} α_{H}^{1 - λ}, α_{λ}]

such that

(p_{λ}^{U} - α_{λ}) + 2 (ϕ_{λ} (1) - (p_{λ}^{U} - α_{λ})) \geq ϕ_{λ} (2)

(i.e.,

2 {(α_{A} + β_{A})}^{λ} {(α_{H} + β_{H})}^{1 - λ} - p_{λ}^{U} \geq {(α_{A} + 2 β_{A})}^{λ} {(α_{H} + 2 β_{H})}^{1 - λ}

) and

q_{λ}^{U} : = ϕ_{λ} (1) - (p_{λ}^{U} - α_{λ}) + β_{λ} = {(α_{A} + β_{A})}^{λ} {(α_{H} + β_{H})}^{1 - λ} - p_{λ}^{U}

, which corresponds to a linear function

ϕ_{λ}^{U}

which is (i) nonpositive on

N_{0}

and strictly negative on

N

, and (ii) larger than or equal to

ϕ_{λ}

on

N_{0}

, strictly larger than

ϕ_{λ}

on

N \ {1, 2}

, and equal to

ϕ_{λ}

at the point

x = 1

(“discrete tangent or secant line through

x = 1

”). One can easily see that (due to the restriction (34)) not all

p_{λ}^{U} \in [α_{A}^{λ} α_{H}^{1 - λ}, α_{λ}]

might qualify for the current purpose. For the particular choice

p_{λ}^{U} = α_{A}^{λ} α_{H}^{1 - λ}

and

q_{λ}^{U} = {(α_{A} + β_{A})}^{λ} {(α_{H} + β_{H})}^{1 - λ} - α_{A}^{λ} α_{H}^{1 - λ}

one obtains

r_{λ}^{U} = p_{λ}^{U} - α_{λ} = b_{1}^{(p_{λ}^{U}, q_{λ}^{U})} < 0

(cf. Lemma A1) and

s_{λ}^{U} = q_{λ}^{U} - β_{λ} = ϕ_{λ} (1) - ϕ_{λ} (0) = a_{1}^{(q_{λ}^{U})} < 0

(secant line through

ϕ_{λ} (0)

and

ϕ_{λ} (1)

).
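The construction of the line through the point x = 1 can be tried out numerically. The sketch below uses the example tuple (1.8, 0.9, 2.7, 0.7, 0.5) from above; the closed form of ϕ_λ and the reading of (35) as "the line dominates ϕ_λ on the nonnegative integers" are assumptions, the helper names are hypothetical, and the domination check is truncated to a finite integer range.

```python
def phi(x, beta_a, beta_h, alpha_a, alpha_h, lam):
    """Assumed form of phi_lambda with f(x) = alpha + beta * x."""
    f_a, f_h = alpha_a + beta_a * x, alpha_h + beta_h * x
    return f_a**lam * f_h**(1 - lam) - (lam * f_a + (1 - lam) * f_h)

params = (1.8, 0.9, 2.7, 0.7, 0.5)
beta_a, beta_h, alpha_a, alpha_h, lam = params
alpha_lam = lam * alpha_a + (1 - lam) * alpha_h
beta_lam = lam * beta_a + (1 - lam) * beta_h

def line_through_one(p):
    """Intercept r, slope s and parameter q of the line with intercept
    p - alpha_lam that passes through the point (1, phi(1))."""
    r = p - alpha_lam
    s = phi(1, *params) - r
    return r, s, s + beta_lam

def qualifies(p, x_max=500):
    """Check the assumed constraint (35) on the integers 0..x_max."""
    r, s, _ = line_through_one(p)
    return all(phi(x, *params) <= r + s * x + 1e-12 for x in range(x_max + 1))

p_secant = alpha_a**lam * alpha_h**(1 - lam)        # smallest admissible intercept parameter
r_sec, s_sec, q_sec = line_through_one(p_secant)

print("secant choice: r =", r_sec, " s =", s_sec, " q =", q_sec)
print("p = alpha_A^lam * alpha_H^(1-lam) qualifies:", qualifies(p_secant))
print("p = alpha_lam qualifies:", qualifies(alpha_lam))
```

In this example the largest intercept parameter p = α_λ fails the check at x = 2, which illustrates why not every value in the interval qualifies.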

For the remaining case

ϕ_{λ}^{'} (0) > 0

, which corresponds to

λ β_{A} {(α_{A} / α_{H})}^{λ - 1} + (1 - λ) β_{H} {(α_{A} / α_{H})}^{λ} - β_{λ} > 0

, the function

ϕ_{λ} (\cdot)

is strictly negative, strictly concave and hump-shaped (cf. (P18)). For the derivation of the parameter choices, we employ

x_{max} : = {argmax}_{x \in] 0, \infty [} ϕ_{λ} (x)

which is the unique solution of

λ β_{A} [{(\frac{f_{A} (x)}{f_{H} (x)})}^{λ - 1} - 1] + (1 - λ) β_{H} [{(\frac{f_{A} (x)}{f_{H} (x)})}^{λ} - 1] = 0, x \in] 0, \infty [,

(50)

(cf. (P17), (P19)); notice that

x = x^{*} : = \frac{α_{H} - α_{A}}{β_{A} - β_{H}}

formally satisfies the Equation (50) but does not qualify because of the current restriction

x^{*} < 0

.

Let us first inspect the case

ϕ_{λ} (⌊ x_{max} ⌋) > ϕ_{λ} (⌊ x_{max} ⌋ + 1)

, where

⌊ x ⌋

denotes the integer part of x. Consider the subcase

ϕ_{λ} (⌊ x_{max} ⌋) + ⌊ x_{max} ⌋ (ϕ_{λ} (⌊ x_{max} ⌋) - ϕ_{λ} (⌊ x_{max} ⌋ + 1)) \leq 0

, which means that the secant line through

ϕ_{λ} (⌊ x_{max} ⌋)

and

ϕ_{λ} (⌊ x_{max} ⌋ + 1)

possesses a non-positive intercept. In this situation it is reasonable to choose as intercept any

p_{λ}^{U} - α_{λ} = b_{1}^{(p_{λ}^{U}, q_{λ}^{U})} = r_{λ}^{U} \in [ϕ_{λ} (⌊ x_{max} ⌋), ϕ_{λ} (⌊ x_{max} ⌋) + ⌊ x_{max} ⌋ (ϕ_{λ} (⌊ x_{max} ⌋) - ϕ_{λ} (⌊ x_{max} ⌋ + 1))]

, and as corresponding slope

q_{λ}^{U} - β_{λ} = a_{1}^{(q_{λ}^{U})} = s_{λ}^{U} = \frac{ϕ_{λ} (⌊ x_{max} ⌋) - r_{λ}^{U}}{(⌊ x_{max} ⌋) - 0} \leq 0

. A larger intercept would lead to a linear function

ϕ_{λ}^{U}

for which (35) is not valid at

⌊ x_{max} ⌋ + 1

. In the other subcase

ϕ_{λ} (⌊ x_{max} ⌋) + ⌊ x_{max} ⌋ (ϕ_{λ} (⌊ x_{max} ⌋) - ϕ_{λ} (⌊ x_{max} ⌋ + 1)) > 0

, one can choose any intercept

p_{λ}^{U} - α_{λ} = b_{1}^{(p_{λ}^{U}, q_{λ}^{U})} = r_{λ}^{U} \in [ϕ_{λ} (⌊ x_{max} ⌋), 0]

and as corresponding slope

q_{λ}^{U} - β_{λ} = a_{1}^{(q_{λ}^{U})} = s_{λ}^{U} = \frac{ϕ_{λ} (⌊ x_{max} ⌋) - r_{λ}^{U}}{(⌊ x_{max} ⌋) - 0} \leq 0

(notice that the corresponding line

ϕ_{λ}^{U}

is on

] ⌊ x_{max} ⌋, \infty [

strictly larger than the secant line through

ϕ_{λ} (⌊ x_{max} ⌋)

and

ϕ_{λ} (⌊ x_{max} ⌋ + 1)

).

If

ϕ_{λ} (⌊ x_{max} ⌋) \leq ϕ_{λ} (⌊ x_{max} ⌋ + 1)

, one can proceed as above by substituting the crucial pair of points

(⌊ x_{max} ⌋, ⌊ x_{max} ⌋ + 1)

with

(⌊ x_{max} ⌋ + 1, ⌊ x_{max} ⌋ + 2)

and examining the analogous two subcases.

3.9. Upper Bounds for the Cases $(β_{A}, β_{H}, α_{A}, α_{H}, λ) \in P_{SP, 3 b} \times] 0, 1 [$

The only difference to the preceding Section 3.8 is that–due to Properties 3 (P15)–the maximum value of

ϕ_{λ} (\cdot)

now achieves 0, at the positive non-integer point

x_{max} = x^{*} = \frac{α_{H} - α_{A}}{β_{A} - β_{H}} \in] 0, \infty [\ N

(take e.g.,

(β_{A}, β_{H}, α_{A}, α_{H}, λ) = (1.8, 0.9, 1.1, 3.0, 0.5)

as an example, which within our running-example epidemiological context of Section 2.3 corresponds to a “nearly dangerous” infectious-disease-transmission situation

(H)

(with nearly critical reproduction number

β_{H} = 0.9

and importation mean of

α_{H} = 3

), whereas

(A)

describes a “dangerous” situation (with supercritical

β_{A} = 1.8

and

α_{A} = 1.1

)); this implies that

ϕ_{λ} (x) < 0

for all x on the relevant subdomain

N_{0}

. Due to (P16), (P17) and (P19) one gets automatically

λ β_{A} {(α_{A} / α_{H})}^{λ - 1} + (1 - λ) β_{H} {(α_{A} / α_{H})}^{λ} - β_{λ} > 0

for all

λ \in] 0, 1 [

. Analogously to Section 3.8, there exist parameters

p_{λ}^{U} \in [α_{A}^{λ} α_{H}^{1 - λ}, α_{λ}]

and

q_{λ}^{U} \in [β_{A}^{λ} β_{H}^{1 - λ}, β_{λ}]

such that (47) and (35) are satisfied. Thus, all the assertions (a) to (e) of Proposition 8 also hold true for the current parameter constellations.

3.10. Upper Bounds for the Cases $(β_{A}, β_{H}, α_{A}, α_{H}, λ) \in P_{SP, 3 c} \times] 0, 1 [$

The only difference to the preceding Section 3.9 is that the maximum value of

ϕ_{λ} (\cdot)

now achieves 0 at the integer point

x_{max} = x^{*} = \frac{α_{H} - α_{A}}{β_{A} - β_{H}} \in N

(take e.g.,

(β_{A}, β_{H}, α_{A}, α_{H}, λ) = (1.8, 0.9, 1.2, 3.0, 0.5)

as an example). Accordingly, any linear function satisfying (47) is strictly negative at this integer point, whereas (35) requires it to be at least the maximum value 0 attained there; hence there do not exist parameters

p_{λ}^{U}, q_{λ}^{U}

such that (35) and (47) are satisfied simultaneously. Without further investigations, the only parameter choice that ensures

exp \{a_{n}^{(q_{λ}^{U})} \cdot X_{0} + \sum_{k = 1}^{n} b_{k}^{(p_{λ}^{U}, q_{λ}^{U})}\} \leq 1

for all

n \in N

and all

X_{0} \in N

is

p_{λ}^{U} = α_{λ}

as well as

q_{λ}^{U} = β_{λ}

. Consequently,

B_{λ, X_{0}, n}^{U} \equiv 1

, which coincides with the general upper bound (9), but violates the above-mentioned desired Goal (G1). However, there might exist parameters

p_{λ}^{U} < α_{λ}, q_{λ}^{U} > β_{λ}

or

p_{λ}^{U} > α_{λ}, q_{λ}^{U} < β_{λ}

, such that at least the parts (c) and (d) of Proposition 8 are satisfied. Nevertheless, by using a conceptually different method we can prove

H_{λ} (P_{A, n} ∥ P_{H, n}) < 1 \forall n \in N \ {1} as well as the convergence lim_{n \to \infty} H_{λ} (P_{A, n} ∥ P_{H, n}) = 0

(51)

which will be used for the study of complete asymptotical distinguishability (entire separation) below. This proof is provided in Appendix A.1.
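The difference between Sections 3.9 and 3.10 shows up when ϕ_λ is evaluated on the integers for the two example tuples. The sketch below assumes the closed form of ϕ_λ used throughout this section (weighted geometric mean of f_A and f_H minus their weighted arithmetic mean) and searches only a truncated integer range; the helper name `integer_maximum` is hypothetical.

```python
def phi(x, beta_a, beta_h, alpha_a, alpha_h, lam):
    """Assumed form of phi_lambda with f(x) = alpha + beta * x."""
    f_a, f_h = alpha_a + beta_a * x, alpha_h + beta_h * x
    return f_a**lam * f_h**(1 - lam) - (lam * f_a + (1 - lam) * f_h)

def integer_maximum(params, x_max=200):
    """Return (argmax, max) of phi over the integers 0..x_max."""
    values = [phi(x, *params) for x in range(x_max + 1)]
    best = max(range(x_max + 1), key=values.__getitem__)
    return best, values[best]

arg_3b, max_3b = integer_maximum((1.8, 0.9, 1.1, 3.0, 0.5))   # x* = 19/9, not an integer
arg_3c, max_3c = integer_maximum((1.8, 0.9, 1.2, 3.0, 0.5))   # x* = 2, an integer

print("P_SP,3b example: maximum", max_3b, "at x =", arg_3b)
print("P_SP,3c example: maximum", max_3c, "at x =", arg_3c)
```

In the first case the integer maximum is small but strictly negative, which leaves room for a line satisfying (47); in the second case the maximum is zero (up to rounding) at x = 2, so no such line exists.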

3.11. Upper Bounds for the Cases $(β_{A}, β_{H}, α_{A}, α_{H}, λ) \in P_{SP, 4 a} \times] 0, 1 [$

This setup and the remaining setup

(β_{A}, β_{H}, α_{A}, α_{H}, λ) \in P_{SP, 4 b} \times] 0, 1 [

(see the next Section 3.12) are the only constellations where

ϕ_{λ} (\cdot)

is strictly negative and strictly increasing, with

{lim}_{x \to \infty} ϕ_{λ} (x) = {lim}_{x \to \infty} ϕ_{λ}^{'} (x) = 0

, leading to the choices

p_{λ}^{U} = α_{λ}

as well as

q_{λ}^{U} = β_{λ} = β

under the restriction that

exp \{a_{n}^{(q_{λ}^{U})} \cdot X_{0} + \sum_{k = 1}^{n} b_{k}^{(p_{λ}^{U}, q_{λ}^{U})}\} \leq 1

for all

n \in N

and all

X_{0} \in N

. Consequently, one has

B_{λ, X_{0}, n}^{U} \equiv 1

, which is consistent with the general upper bound (9) but violates the above-mentioned desired Goal (G1). Unfortunately, the proof method of (51) (cf. Appendix A.1) cannot be carried over to the current setup. The following proposition states two of the above-mentioned desired assertions which can be verified by a completely different proof method, which is also given in Appendix A.1.

Proposition 9.

For all

(β_{A}, β_{H}, α_{A}, α_{H}, λ) \in P_{SP, 4 a} \times] 0, 1 [

there exist parameters

p_{λ}^{U} < α_{λ}

,

1 > q_{λ}^{U} > β_{λ} = β

such that (35) is satisfied for all

x \in [0, \infty [

and such that for all initial population sizes

X_{0} \in N

the parts (c) and (d) of Proposition 8 hold true.

3.12. Upper Bounds for the Cases $(β_{A}, β_{H}, α_{A}, α_{H}, λ) \in P_{SP, 4 b} \times] 0, 1 [$

The assertions preceding Proposition 9 remain valid. However, any linear upper bound of the function

ϕ_{λ} (\cdot)

on the domain

N_{0}

possesses the slope

q_{λ}^{U} - β_{λ} \geq 0

. If

q_{λ}^{U} = β_{λ}

, then the intercept is

p_{λ}^{U} - α_{λ} = 0

leading to

B_{λ, X_{0}, n}^{U} \equiv 1

and thus Goal (G1) is violated. If we use a slope

q_{λ}^{U} - β_{λ} > 0

, then both the sequences

{(a_{n}^{(q_{λ}^{U})})}_{n \in N}

and

{(b_{n}^{(p_{λ}^{U}, q_{λ}^{U})})}_{n \in N}

are strictly increasing and diverge to ∞. This comes from Properties 1 (P3b) and (P7b) since

q_{λ}^{U} > β_{λ} = β \geq 1

. Altogether, this implies that the corresponding upper bound component

{\tilde{B}}_{λ, X_{0}, n}^{(p_{λ}^{U}, q_{λ}^{U})}

(cf. (42)) diverges to ∞ as well. This leads to

Proposition 10.

For all

(β_{A}, β_{H}, α_{A}, α_{H}, λ) \in P_{SP, 4 b} \times] 0, 1 [

and all initial population sizes

X_{0} \in N

there do not exist parameters

p_{λ}^{U} \geq 0

,

q_{λ}^{U} \geq 0

such that (35) is satisfied and such that the parts (c) and (d) of Proposition 8 hold true.
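The contrast between Sections 3.11 and 3.12 can be seen by iterating the sequence for a slope parameter slightly above β. The sketch below assumes the recursion a_n = q·exp(a_{n-1}) − β_λ with a_0 = 0 (inferred from this section, not quoted from (36)); the two parameter pairs are hypothetical and only meant to represent the situations β < 1 and β ≥ 1.

```python
import math

def a_sequence(q, beta_lam, n):
    """First n terms of the ASSUMED recursion a_k = q*exp(a_{k-1}) - beta_lam, a_0 = 0."""
    a, out = 0.0, []
    for _ in range(n):
        a = q * math.exp(a) - beta_lam
        out.append(a)
    return out

# Situation of Section 3.11: beta = 0.5 < 1 and beta < q <= min{1, exp(beta - 1)}.
converging = a_sequence(q=0.6, beta_lam=0.5, n=100)

# Situation of Section 3.12: beta = 1 and q > beta. Only a few steps are taken,
# because the terms grow so fast that exp() would overflow soon afterwards.
diverging = a_sequence(q=1.05, beta_lam=1.0, n=10)

print("beta < 1: a_100 =", converging[-1])
print("beta >= 1: first ten terms =", diverging)
```

Both sequences are increasing, but only the first one settles at a fixed point of x ↦ q·exp(x) − β.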

3.13. Concluding Remarks on Alternative Upper Bounds for all Cases $(β_{A}, β_{H}, α_{A}, α_{H}, λ) \in$ $(P_{SP} \ P_{SP, 1}) \times] 0, 1 [$

As mentioned earlier on, starting from Section 3.6 we have principally focused on constructing upper bounds

B_{λ, X_{0}, n}^{U}

of the Hellinger integrals, starting from

p_{λ}^{U}, q_{λ}^{U}

which fulfill (35) as well as further constraints depending on the Goals (G1) and (G2). For the setups in Sections 3.7, 3.8 and 3.9, we have proved the existence of special parameter choices

p_{λ}^{U}, q_{λ}^{U}

which were consistent with (G1) and (G2). Furthermore, for the constellation in Section 3.11 we have found parameters such that at least (G2) is satisfied. In contrast, for the setup of Section 3.12 we have not found any choices which are consistent with (G1) and (G2), leading to the “cut-off bound”

B_{λ, X_{0}, n}^{U} \equiv 1

which gives no improvement over the generally valid upper bound (9).

In the following, we present some alternative choices of

p_{λ}^{U}, q_{λ}^{U}

which–depending on the parameter constellation

(β_{A}, β_{H}, α_{A}, α_{H}, λ) \in (P_{SP} \ P_{SP, 1}) \times] 0, 1 [

–may or may not lead to upper bounds

B_{λ, X_{0}, n}^{U}

which are consistent with Goal (G1) or with (G2) (and which may be weaker than, better than, or incomparable with the previous upper bounds when dealing with some relaxations of (G1), such as e.g.,

H_{λ} (P_{A, n} ∥ P_{H, n}) < 1

for all but finitely many

n \in N

).

As a first alternative choice for a linear upper bound of

ϕ_{λ} (\cdot)

(cf. (35)) one could use the asymptote

\tilde{ϕ_{λ}} (\cdot)

(cf. Properties 3 (P20)) with the parameters

p_{λ}^{U} : = \tilde{p_{λ}} = λ α_{A} {(β_{A} / β_{H})}^{λ - 1} + (1 - λ) α_{H} {(β_{A} / β_{H})}^{λ}

and

q_{λ}^{U} : = \tilde{q_{λ}} = β_{A}^{λ} β_{H}^{1 - λ}

. Another important linear upper bound of

ϕ_{λ} (\cdot)

is the tangent line

ϕ_{λ, y}^{\tan} (\cdot)

on

ϕ_{λ} (\cdot)

at an arbitrarily fixed point

y \in [0, \infty [

, which amounts to

ϕ_{λ, y}^{\tan} (x) : = r_{λ, y}^{\tan} + s_{λ, y}^{\tan} \cdot x : = (p_{λ, y}^{\tan} - α_{λ}) + (q_{λ, y}^{\tan} - β_{λ}) \cdot x : = (ϕ_{λ} (y) - y \cdot ϕ_{λ}^{'} (y)) + ϕ_{λ}^{'} (y) \cdot x,

(52)

where

ϕ_{λ}^{'} (\cdot)

is given by (P17). Notice that this upper bound is for

y \in] 0, \infty [\ N

“not tight” in the sense that

ϕ_{λ, y}^{\tan} (\cdot)

does not hit the function

ϕ_{λ} (\cdot)

on

N_{0}

(where the generation sizes “live”); moreover,

ϕ_{λ, y}^{\tan} (x)

might take on strictly positive values for large enough points x which is counter-productive for Goal (G1). Another alternative choice of a linear upper bound for

ϕ_{λ} (\cdot)

, which in contrast to the tangent line is “tight” (but not necessarily avoiding the strict positivity), is the secant line

ϕ_{λ, k}^{\sec} (\cdot)

across its arguments k and

k + 1

, given by

\begin{matrix} ϕ_{λ, k}^{\sec} (x) & : = & r_{λ, k}^{\sec} + s_{λ, k}^{\sec} \cdot x : = (p_{λ, k}^{\sec} - α_{λ}) + (q_{λ, k}^{\sec} - β_{λ}) \cdot x \\ & : = & [ϕ_{λ} (k) - k \cdot (ϕ_{λ} (k + 1) - ϕ_{λ} (k))] + (ϕ_{λ} (k + 1) - ϕ_{λ} (k)) \cdot x . \end{matrix}

(53)

Another alternative choice is the horizontal line

ϕ_{λ}^{hor} (x) \equiv max \{ϕ_{λ} (y), y \in N_{0}\} .

(54)
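The four alternative lines (asymptote, tangent (52), secant (53) and horizontal line (54)) can be computed side by side. The sketch below assumes the closed form of ϕ_λ used in this section and the derivative expression from (P17) as it appears in Equation (50); the tuple is the first example of Section 3.8, the tangent point y = 1.5 and the secant index k = 1 are arbitrary illustrative choices, and the maximum in (54) is taken over a truncated integer range.

```python
def phi(x, b_a, b_h, a_a, a_h, lam):
    """Assumed form of phi_lambda with f(x) = alpha + beta * x."""
    f_a, f_h = a_a + b_a * x, a_h + b_h * x
    return f_a**lam * f_h**(1 - lam) - (lam * f_a + (1 - lam) * f_h)

def phi_prime(x, b_a, b_h, a_a, a_h, lam):
    """Derivative of phi_lambda (same expression as on the left of (50))."""
    ratio = (a_a + b_a * x) / (a_h + b_h * x)
    return lam * b_a * ratio**(lam - 1) + (1 - lam) * b_h * ratio**lam \
        - (lam * b_a + (1 - lam) * b_h)

def candidate_lines(params, y=1.5, k=1, x_max=200):
    """Return {name: (intercept r, slope s)} for the four alternative lines."""
    b_a, b_h, a_a, a_h, lam = params
    alpha_lam = lam * a_a + (1 - lam) * a_h
    beta_lam = lam * b_a + (1 - lam) * b_h
    p_tilde = lam * a_a * (b_a / b_h)**(lam - 1) + (1 - lam) * a_h * (b_a / b_h)**lam
    q_tilde = b_a**lam * b_h**(1 - lam)
    s_tan = phi_prime(y, *params)
    s_sec = phi(k + 1, *params) - phi(k, *params)
    return {
        "asymptote": (p_tilde - alpha_lam, q_tilde - beta_lam),
        "tangent": (phi(y, *params) - y * s_tan, s_tan),
        "secant": (phi(k, *params) - k * s_sec, s_sec),
        "horizontal": (max(phi(x, *params) for x in range(x_max + 1)), 0.0),
    }

params = (1.8, 0.9, 2.7, 0.7, 0.5)
lines = candidate_lines(params)
for name, (r, s) in lines.items():
    print(f"{name:10s} intercept r = {r:+.6f}  slope s = {s:+.6f}")
```

The corresponding parameters are recovered as p = r + α_λ and q = s + β_λ; which of the four lines gives the smallest bound depends on X_0 and n.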

For

p_{λ}^{U} \in \{\tilde{p_{λ}}, p_{λ, y}^{\tan}, p_{λ, k}^{\sec}\}

and

q_{λ}^{U} \in \{q_{λ, y}^{\tan}, q_{λ, k}^{\sec}\}

it is possible that in some parameter cases

(β_{A}, β_{H}, α_{A}, α_{H})

either the intercept

r_{λ}^{U} = p_{λ}^{U} - α_{λ}

is strictly larger than zero or the slope

s_{λ}^{U} = q_{λ}^{U} - β_{λ}

is strictly larger than zero. Thus, it can happen that

{\tilde{B}}_{λ, X_{0}, n}^{(p_{λ}^{U}, q_{λ}^{U})} > 1

for some (and even for all)

n \in N

, such that the corresponding upper bound

B_{λ, X_{0}, n}^{U}

for the Hellinger integral

H_{λ} (P_{A, n} ∥ P_{H, n})

amounts to the cut-off at 1. However, due to Properties 1 (P5) and (P7a), the sequence

{({\tilde{B}}_{λ, X_{0}, n}^{(p_{λ}^{U}, q_{λ}^{U})})}_{n \in N}

may become smaller than 1 and may finally converge to zero. Due to Properties 2 (P14), this upper bound can even be tighter (smaller) than those bounds derived from parameters

p_{λ}^{U}, q_{λ}^{U}

fulfilling (47).

As far as our desired Hellinger integral bounds are concerned, in the setup of Section 3.11 (where

{lim}_{y \to \infty} ϕ_{λ, y}^{\tan} (\cdot) \equiv 0

) we shall employ, for the proof of Proposition 9 in Appendix A.1, the mappings

y \mapsto ϕ_{λ, y}^{\tan}

resp.

y \mapsto p_{λ, y}^{\tan}

resp.

y \mapsto q_{λ, y}^{\tan}

. These will also be used for the proof of the below-mentioned Theorem 4.

3.14. Intermezzo 1: Application to Asymptotical Distinguishability

The above-mentioned investigations can be applied to the context of Section 2.6 on asymptotical distinguishability. Indeed, with the help of Definitions 1 and 2 as well as the equivalence relations (25) and (26) we obtain the following

Corollary 1.

(a): For all $(β_{A}, β_{H}, α_{A}, α_{H}) \in P_{SP} \ P_{SP, 4 b}$ and all initial population sizes $X_{0} \in N$ , the corresponding sequences ${(P_{A, n})}_{n \in N_{0}}$ and ${(P_{H, n})}_{n \in N_{0}}$ are entirely separated (completely asymptotically distinguishable).
(b): For all $(β_{A}, β_{H}, α_{A}, α_{H}) \in P_{NI}$ with $β_{A} \leq 1$ and all initial population sizes $X_{0} \in N$ , the sequence ${(P_{A, n})}_{n \in N_{0}}$ is contiguous to ${(P_{H, n})}_{n \in N_{0}}$ .
(c): For all $(β_{A}, β_{H}, α_{A}, α_{H}) \in P_{NI}$ with $β_{A} > 1$ and all initial population sizes $X_{0} \in N$ , the sequence ${(P_{A, n})}_{n \in N_{0}}$ is neither contiguous to nor entirely separated from ${(P_{H, n})}_{n \in N_{0}}$ .

The proof of Corollary 1 will be given in Appendix A.1.

Remark 3.

(a): Assertion (c) of Corollary 1 contrasts with the case of Gaussian processes with independent increments, where one gets either entire separation or mutual contiguity (see e.g., Liese & Vajda [1]).
(b): By putting Corollary 1(b) and (c) together, we obtain for different “criticality pairs” in the non-immigration case $P_{NI}$ the following asymptotical distinguishability types: $(P_{A, n}) ◃ ▹ (P_{H, n})$ if $β_{A} \leq 1$, $β_{H} \leq 1$; $(P_{A, n}) ◃ \bar{▹} (P_{H, n})$ if $β_{A} \leq 1$, $β_{H} > 1$; $(P_{A, n}) \bar{◃} ▹ (P_{H, n})$ if $β_{A} > 1$, $β_{H} \leq 1$; $(P_{A, n}) \bar{◃} \bar{▹} (P_{H, n})$ and $(P_{A, n}) \bar{△} (P_{H, n})$ if $β_{A} > 1$, $β_{H} > 1$; in particular, for $P_{NI}$ the sequences ${(P_{A, n})}_{n \in N_{0}}$ and ${(P_{H, n})}_{n \in N_{0}}$ are not completely asymptotically inseparable (indistinguishable).
(c): In the light of the above-mentioned characterizations of contiguity resp. entire separation by means of Hellinger integral limits, the finite-time-horizon results on Hellinger integrals given in the “ $λ \in] 0, 1 [$ parts” of Theorem 1, Sections 3.3 to 3.13 and also in the below-mentioned Section 6 can loosely be interpreted as “finite-sample (rather than asymptotical) distinguishability” assertions.

3.15. Intermezzo 2: Application to Decision Making under Uncertainty

3.15.1. Bayesian Decision Making

The above-mentioned investigations can be applied to the context of Section 2.5 on dichotomous Bayesian decision making on the space of all possible path scenarios (path space) of Poissonian Galton-Watson processes without/with immigration GW(I) (e.g., in combination with our running-example epidemiological context of Section 2.3). In more detail, for the minimal mean decision loss (Bayes risk)

R_{n}

defined by (18) we can derive upper (respectively lower) bounds by using (19) respectively (20) together with the exact values or the upper (respectively lower) bounds of the Hellinger integrals

H_{λ} (P_{A, n} ∥ P_{H, n})

derived in the “

λ \in] 0, 1 [

parts” of Theorem 1, Sections 3.3 to 3.13 (and also in the below-mentioned Section 6); instead of providing the corresponding resulting formulas, which would be merely repetitive, we give the illustrative

Example 1.

Based on a sample path observation

X_{n} : = \{X_{ℓ} : ℓ = 1, \ldots, n\}

of a GWI, which is either governed by a hypothesis law

P_{H}

or an alternative law

P_{A}

, we want to make a dichotomous optimal Bayesian decision described in Section 2.5, namely, decide between an action

d_{H}

“associated with”

P_{H}

and an action

d_{A}

“associated with”

P_{A}

, with pregiven loss function (16) involving constants

L_{A} > 0

,

L_{H} > 0

which e.g., arise as bounds from quantities in worst-case scenarios.

For this, let us exemplarily deal with initial population

X_{0} = 5

as well as parameter setup

(β_{A}, β_{H}, α_{A}, α_{H}) = (1.2, 0.9, 4, 3) \in P_{SP, 1}

; within our running-example epidemiological context of Section 2.3, this corresponds e.g., to a setup where one is confronted with a novel infectious disease (such as COVID-19) of non-negligible fatality rate, and

(A)

reflects a “potentially dangerous” infectious-disease-transmission situation (with supercritical reproduction number

β_{A} = 1.2

and importation mean of

α_{A} = 4

, for weekly appearing new incidence-generations) whereas

(H)

describes a “milder” situation (with subcritical

β_{H} = 0.9

and

α_{H} = 3

). Moreover, let

d_{H}

and

d_{A}

reflect two possible sets of interventions (control measures) in the course of pandemic risk management, with respective “worst-case type” decision losses

L_{A} = 600

and

L_{H} = 300

(e.g., in units of billion Euros or U.S. Dollars). Additionally we assume the prior probabilities

π = P r (H) = 1 - P r (A) = 0.5

, which results in the prior-loss constants (the above losses weighted by the prior probabilities)

0.5 \cdot L_{A} = 300

and

0.5 \cdot L_{H} = 150

. In order to obtain bounds for the corresponding minimal mean decision loss (Bayes Risk)

R_{n}

defined in (18) we can employ the general Stummer-Vajda bounds (cf. [15]) (19) and (20) in terms of the Hellinger integral

H_{λ} (P_{A, n} ∥ P_{H, n})

(with arbitrary

λ \in] 0, 1 [

), and combine this with the appropriate detailed results on the latter from the preceding subsections. To demonstrate this, let us choose

λ = 0.5

(for which

H_{1 / 2} (P_{A, n} ∥ P_{H, n})

can be interpreted as a multiple of the Bhattacharyya coefficient between the two competing GWI) respectively

λ = 0.9

, leading to the parameters

p_{0.5}^{E} = 3.464, q_{0.5}^{E} = 1.039

respectively

p_{0.9}^{E} = 3.887

,

q_{0.9}^{E} = 1.166

(cf. (33)). Combining (19) and (20) with Theorem 1 (a), which provides us with the exact recursive values of

H_{λ} (P_{A, n} ∥ P_{H, n})

in terms of the sequence

a_{n}^{(q_{λ}^{E})}

(cf. (36)), we obtain for

λ = 0.5

the bounds

\begin{matrix} R_{n} & \leq & R_{n}^{U} : = 2.121 \cdot 10^{2} \cdot exp \{5 \cdot a_{n}^{(1.039)} + \frac{10}{3} \cdot \sum_{k = 1}^{n} a_{k}^{(1.039)}\}, \\ R_{n} & \geq & R_{n}^{L} : = 100 \cdot exp \{10 \cdot a_{n}^{(1.039)} + \frac{20}{3} \cdot \sum_{k = 1}^{n} a_{k}^{(1.039)}\}, \end{matrix}

whereas for

λ = 0.9

we get

\begin{matrix} R_{n} & \leq & R_{n}^{U} : = 2.799 \cdot 10^{2} \cdot exp \{5 \cdot a_{n}^{(1.166)} + \frac{10}{3} \cdot \sum_{k = 1}^{n} a_{k}^{(1.166)}\}, \\ R_{n} & \geq & R_{n}^{L} : = 3.902 \cdot exp \{50 \cdot a_{n}^{(1.166)} + \frac{100}{3} \cdot \sum_{k = 1}^{n} a_{k}^{(1.166)}\} . \end{matrix}

Figure 1 illustrates the lower (orange resp. cyan) and upper (red resp. blue) bounds

R_{n}^{L}

resp.

R_{n}^{U}

of the Bayes Risk

R_{n}

employing

λ = 0.5

resp.

λ = 0.9

on both a unit scale (left graph) and a logarithmic scale (right graph). The lightgrey/grey/black curves correspond to the (18)-based empirical evaluation of the Bayes risk sequence

{(R_{n}^{sample})}_{n = 1, . . ., 50}

from three independent Monte Carlo simulations of 10000 GWI sample paths (each) up to time horizon 50.
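The two pairs of bounds above can be re-traced numerically. The following is a minimal sketch, not the authors' implementation, under several assumptions that are not stated explicitly in this passage: the recursion behind (36) is taken to be a_n = q·exp(a_{n-1}) − β_λ with a_0 = 0 and β_λ = λβ_A + (1−λ)β_H; the initial population X_0 = 5 is read off the factor 5 in the displayed bounds; and the constants in front of the exponentials are computed with a formula chosen because it reproduces the printed values 2.121·10², 100, 2.799·10² and 3.902. All function and variable names are hypothetical.

```python
import math

def a_sequence(q, beta_lam, n_max):
    """Return [a_1, ..., a_{n_max}] for a_n = q * exp(a_{n-1}) - beta_lam, a_0 = 0.

    This recursion is an assumed reading of (36); it reproduces a_1 = q - beta_lam.
    """
    seq, a = [], 0.0
    for _ in range(n_max):
        a = q * math.exp(a) - beta_lam
        seq.append(a)
    return seq

def bayes_risk_bounds(lam, n_max, x0=5, beta_a=1.2, beta_h=0.9, alpha_a=4.0,
                      loss_a=300.0, loss_h=150.0):
    """Return (c_low, c_up, [(R_n^L, R_n^U) for n = 1..n_max]) for lam in ]0, 1[."""
    q = beta_a ** lam * beta_h ** (1 - lam)          # q_lambda^E
    beta_lam = lam * beta_a + (1 - lam) * beta_h     # assumed definition of beta_lambda
    ratio = alpha_a / beta_a                         # equals alpha_H / beta_H on P_SP,1
    k = max(1 / lam, 1 / (1 - lam))                  # power of H_lambda in the lower bound
    c_up = loss_a ** lam * loss_h ** (1 - lam)
    c_low = (loss_a ** max(1, lam / (1 - lam))
             * loss_h ** max(1, (1 - lam) / lam)
             * (loss_a + loss_h) ** (1 - k))
    bounds, cum = [], 0.0
    for a_n in a_sequence(q, beta_lam, n_max):
        cum += a_n
        log_h = a_n * x0 + ratio * cum               # log of the Hellinger integral
        bounds.append((c_low * math.exp(k * log_h), c_up * math.exp(log_h)))
    return c_low, c_up, bounds

c_low, c_up, bounds = bayes_risk_bounds(0.5, 50)
for n in (1, 10, 50):
    low, up = bounds[n - 1]
    print(f"n={n:2d}  R_n^L={low:.4g}  R_n^U={up:.4g}")
```

Calling `bayes_risk_bounds(0.9, 50)` gives the corresponding values for λ = 0.9. Working with the logarithm of the Hellinger integral and exponentiating only at the end keeps the sketch numerically stable for larger horizons.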

3.15.2. Neyman-Pearson Testing

By combining (23) with the exact values resp. upper bounds of the Hellinger integrals

H_{λ} (P_{A, n} ∥ P_{H, n})

from the preceding subsections, we obtain for our context of GW(I) with Poisson offspring and Poisson immigration (including the non-immigration case) some upper bounds of the minimal type II error probability

E_{ς} (P_{A, n} ∥ P_{H, n})

in the class of the tests for which the type I error probability is at most

ς \in] 0, 1 [

, which can also be immediately rewritten as lower bounds for the power

1 - E_{ς} (P_{A, n} ∥ P_{H, n})

of a most powerful test at level

ς

. As for the Bayesian context of Section 3.15.1, instead of providing the merely repetitive resulting formulas for the bounds of

E_{ς} (P_{A, n} ∥ P_{H, n})

we give the illustrative

Example 2.

Consider Figures 2 and 3, which deal with the initial population

X_{0} = 5

and the parameter setup

(β_{A}, β_{H}, α_{A}, α_{H}) = (0.3, 1.2, 1, 4) \in P_{SP, 1}

; within our running-example epidemiological context of Section 2.3, this corresponds to a “potentially dangerous” infectious-disease-transmission situation

(H)

(with supercritical reproduction number

β_{H} = 1.2

and importation mean of

α_{H} = 4

), whereas

(A)

describes a “very mild” situation (with “low” subcritical

β_{A} = 0.3

and

α_{A} = 1

). Figure 2 shows the lower and upper bounds of

E_{ς} (P_{A, n} ∥ P_{H, n})

with

ς = 0.05

, evaluated from the Formulas (23) and (24), together with the exact values of the Hellinger integral

H_{λ} (P_{A, n} ∥ P_{H, n})

, cf. Theorem 1 (recall that we are in the setup

P_{SP, 1}

) on both a unit scale (left graph) and a logarithmic scale (right graph). The orange resp. red resp. purple curves correspond to the resulting upper bounds

E_{n}^{U} : = E_{n}^{U} (P_{A, n} ∥ P_{H, n})

(cf. (23)) with parameters

λ = 0.3

resp.

λ = 0.5

resp.

λ = 0.7

. The green resp. cyan resp. blue curves correspond to the lower bounds

E_{n}^{L} : = E_{n}^{L} (P_{A, n} ∥ P_{H, n})

(cf. (24)) with parameters

λ = 2

resp.

λ = 1.5

resp.

λ = 1.1

. Notice the different λ-ranges in (23) and (24). In contrast, Figure 3 compares the lower bound

E_{n}^{L}

(for fixed

λ = 1.1

) with the upper bound

E_{n}^{U}

(for fixed

λ = 0.5

) of the minimal type II error probability

E_{ς} (P_{A, n} ∥ P_{H, n})

for different levels

ς = 0.1

(orange for the lower and cyan for the upper bound),

ς = 0.05

(green and magenta) and

ς = 0.01

(blue and purple) on both a unit scale (left graph) and a logarithmic scale (right graph).

3.16. Goals for Lower Bounds for the Cases $(β_{A}, β_{H}, α_{A}, α_{H}, λ) \in (P_{SP} \ P_{SP, 1}) \times (R \ [0, 1])$

Recall from (49) the set

P_{SP} : = \{(β_{A}, β_{H}, α_{A}, α_{H}) \in {] 0, \infty [}^{4} : (α_{A} \neq α_{H}) or (β_{A} \neq β_{H}) or both\}

and the “equal-fraction-case” set

P_{SP, 1} : = \{(β_{A}, β_{H}, α_{A}, α_{H}) \in P_{SP} : α_{A} \neq α_{H}, β_{A} \neq β_{H}, \frac{α_{A}}{β_{A}} = \frac{α_{H}}{β_{H}}\}

, where for the latter we have derived in Theorem 1(a) and in Proposition 5 the exact recursive values for the time-behaviour of the Hellinger integrals

H_{λ} (P_{A, n} ∥ P_{H, n})

of order

λ \in R \ [0, 1]

. Moreover, recall that for the case

(β_{A}, β_{H}, α_{A}, α_{H}, λ) \in (P_{SP} \ P_{SP, 1}) \times] 0, 1 [

we have obtained in the Section 3.4 and Section 3.5 some “optimal” linear lower bounds

ϕ_{λ}^{L} (\cdot)

for the strictly concave function

ϕ_{λ} (x) : = ϕ (x, β_{A}, β_{H}, α_{A}, α_{H}, λ)

on the domain

x \in [0, \infty [

; due to the monotonicity Properties 2 (P10) to (P12) of the sequences

{(a_{n}^{(q_{λ}^{L})})}_{n \in N}

and

{(b_{n}^{(p_{λ}^{L}, q_{λ}^{L})})}_{n \in N}

, these bounds have led to the “optimal” recursive lower bound

B_{λ, X_{0}, n}^{L}

of the Hellinger integral

H_{λ} (P_{A, n} ∥ P_{H, n})

in (40) of Theorem 1(b).

In contrast, the strict convexity of the function

ϕ_{λ} (\cdot)

in the case

(β_{A}, β_{H}, α_{A}, α_{H}, λ) \in (P_{SP} \ P_{SP, 1}) \times (R \ [0, 1])

implies that we cannot maximize both parameters

p_{λ}^{L}, q_{λ}^{L} \in R

simultaneously subject to the constraint (35). This effect carries over to the lower bounds

B_{λ, X_{0}, n}^{L}

of the Hellinger integrals

H_{λ} (P_{A, n} ∥ P_{H, n})

(cf. (41)); in general, these bounds cannot be maximized simultaneously for all initial population sizes

X_{0} \in N

and all observation horizons

n \in N

.

Analogously to (46), one way to obtain “good” recursive lower bounds for

H_{λ} (P_{A, n} ∥ P_{H, n})

from (41) in Theorem 1 (b) is to solve the optimization problem,

(\bar{p_{λ}^{L}}, \bar{q_{λ}^{L}}) : = arg max_{(p_{λ}^{L}, q_{λ}^{L}) \in R^{2}} \{exp \{a_{n}^{(q_{λ}^{L})} \cdot X_{0} + \sum_{k = 1}^{n} b_{k}^{(p_{λ}^{L}, q_{λ}^{L})}\}\} such that (35) is satisfied,

(55)

for each fixed initial population size

X_{0} \in N

and observation horizon

n \in N

. But for the same reasons as explained right after (46), the optimization problem (55) does not seem to be straightforward to solve explicitly. Analogously to the discussion of the upper bounds for the case

λ \in] 0, 1 [

above, we now have to look for suitable parameters

p_{λ}^{L}, q_{λ}^{L}

for the lower bound

B_{λ, X_{0}, n}^{L} \leq H_{λ} (P_{A, n} ∥ P_{H, n})

that fulfill (35) and that guarantee certain reasonable criteria and goals; these are similar to the goals (G1) to (G3) from Section 3.6, and are therefore supplemented by an additional “

^{'}

”:

(G1 $^{'}$ ): the validity of $B_{λ, X_{0}, n}^{L} > 1$ simultaneously for all initial configurations $X_{0} \in N$ , all observation horizons $n \in N$ and all $λ \in R \ [0, 1]$ , which leads to a strict improvement of the general lower bound $H_{λ} (P_{A, n} ∥ P_{H, n}) > 1$ (cf. (11));
(G2 $^{'}$ ): the determination of the long-term-limits ${lim}_{n \to \infty} H_{λ} (P_{A, n} ∥ P_{H, n})$ respectively ${lim}_{n \to \infty} B_{λ, X_{0}, n}^{L}$ for all $X_{0} \in N$ and all $λ \in R \ [0, 1]$ ; in particular, one would like to check whether ${lim}_{n \to \infty} H_{λ} (P_{A, n} ∥ P_{H, n}) = \infty$ ;
(G3 $^{'}$ ): the determination of the time-asymptotical growth rates ${lim}_{n \to \infty} \frac{1}{n} log (H_{λ} (P_{A, n} ∥ P_{H, n}))$ resp. ${lim}_{n \to \infty} \frac{1}{n} log (B_{λ, X_{0}, n}^{L})$ for all $X_{0} \in N$ and all $λ \in R \ [0, 1]$ .

In the following, let us briefly discuss how these three goals can be achieved in principle, where we confine ourselves to parameters

p_{λ}^{L}, q_{λ}^{L}

which–in addition to (35)–fulfill the requirement

\{q_{λ}^{L} \geq max {0, β_{λ}} \land p_{λ}^{L} > max {0, α_{λ}}\} \lor \{q_{λ}^{L} > max {0, β_{λ}} \land p_{λ}^{L} \geq max {0, α_{λ}}\},

(56)

where ∧ is the logical “AND” and ∨ the logical “OR” operator. This is sufficient to tackle all three Goals (G1

^{'}

) to (G3

^{'}

). To see this, assume that

p_{λ}^{L}, q_{λ}^{L}

satisfy (35). Let us begin with the two “extremal” cases in (56), i.e., with (i)

q_{λ}^{L} = max {0, β_{λ}}, p_{λ}^{L} > max {0, α_{λ}}

, respectively (ii)

q_{λ}^{L} > max {0, β_{λ}}, p_{λ}^{L} = max {0, α_{λ}}

.

Suppose in the first extremal case (i) that

β_{λ} \leq 0

. Then,

q_{λ}^{L} = 0

and Properties 1 (P4) implies that

a_{n}^{(q_{λ}^{L})} = - β_{λ} \geq 0

and hence

b_{n}^{(p_{λ}^{L}, q_{λ}^{L})} = p_{λ}^{L} e^{- β_{λ}} - α_{λ} \geq p_{λ}^{L} - α_{λ} > 0

for all

n \in N

. This enters into (41) as follows: the Hellinger integral lower bound becomes

B_{λ, X_{0}, n}^{L} \geq {\tilde{B}}_{λ, X_{0}, n}^{(p_{λ}^{L}, q_{λ}^{L})} = exp {- β_{λ} \cdot X_{0} + (p_{λ}^{L} e^{- β_{λ}} - α_{λ}) \cdot n} > 1

. Furthermore, one clearly has

{lim}_{n \to \infty} B_{λ, X_{0}, n}^{L} = \infty

as well as

{lim}_{n \to \infty} \frac{1}{n} log (B_{λ, X_{0}, n}^{L}) = p_{λ}^{L} e^{- β_{λ}} - α_{λ} > 0

. Assume now that

β_{λ} > 0

. Then,

q_{λ}^{L} = β_{λ} > 0

,

a_{n}^{(q_{λ}^{L})} = 0

(cf. (P2)),

b_{n}^{(p_{λ}^{L}, q_{λ}^{L})} = p_{λ}^{L} - α_{λ} > 0

and thus

B_{λ, X_{0}, n}^{L} = exp {(p_{λ}^{L} - α_{λ}) \cdot n} > 1

for all

n \in N

. Furthermore, one gets

{lim}_{n \to \infty} B_{λ, X_{0}, n}^{L} = \infty

as well as

{lim}_{n \to \infty} \frac{1}{n} log (B_{λ, X_{0}, n}^{L}) = p_{λ}^{L} - α_{λ} > 0

.

Let us consider the other above-mentioned extremal case (ii). Suppose that

q_{λ}^{L} > max {0, β_{λ}}

together with

q_{λ}^{L} > min {1, e^{β_{λ} - 1}}

which implies that the sequence

{(a_{n}^{(q_{λ}^{L})})}_{n \in N}

is strictly positive, strictly increasing and grows to infinity faster than exponentially, cf. (P3b). Hence,

B_{λ, X_{0}, n}^{L} \geq exp {a_{n}^{(q_{λ}^{L})} \cdot X_{0}} > 1

,

{lim}_{n \to \infty} B_{λ, X_{0}, n}^{L} = \infty

as well as

{lim}_{n \to \infty} \frac{1}{n} log (B_{λ, X_{0}, n}^{L}) = \infty

. If

max {0, β_{λ}} < q_{λ}^{L} \leq min {1, e^{β_{λ} - 1}}

, then

{(a_{n}^{(q_{λ}^{L})})}_{n \in N}

is strictly positive, strictly increasing and converges to

x_{0}^{(q_{λ}^{L})} \in] 0, - log (q_{λ}^{L})]

(cf. (P3a)). This carries over to the sequence

{(b_{n}^{(p_{λ}^{L}, q_{λ}^{L})})}_{n \in N}

: one gets

b_{1}^{(p_{λ}^{L}, q_{λ}^{L})} = p_{λ}^{L} - α_{λ} \geq 0

and

b_{n}^{(p_{λ}^{L}, q_{λ}^{L})} > 0

for all

n \geq 2

. Furthermore,

b_{n}^{(p_{λ}^{L}, q_{λ}^{L})}

is strictly increasing and converges to

p_{λ}^{L} \cdot e^{x_{0}^{(q_{λ}^{L})}} - α_{λ} > 0

, leading to

B_{λ, X_{0}, n}^{L} > 1

for all

n \in N

, to

{lim}_{n \to \infty} B_{λ, X_{0}, n}^{L} = \infty

as well as to

{lim}_{n \to \infty} \frac{1}{n} log (B_{λ, X_{0}, n}^{L}) = p_{λ}^{L} \cdot e^{x_{0}^{(q_{λ}^{L})}} - α_{λ} > 0

.
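The dichotomy between convergence and faster-than-exponential divergence in the extremal case (ii) can be made tangible with a few lines of code. The sketch below assumes that the sequences have the recursive form a_n = q·exp(a_{n-1}) − β_λ (a_0 = 0) and b_n = p·exp(a_{n-1}) − α_λ, which is consistent with the values b_1 = p − α_λ and the limits stated above. The numbers α_λ = 5, β_λ = 0.4, p = 5 and q ∈ {0.45, 0.6} are purely hypothetical and are not checked against the constraint (35); they only illustrate the behaviour of the recursion.

```python
import math

def iterate_a(q, beta_lam, n_max, cap=50.0):
    """Iterate a_n = q * exp(a_{n-1}) - beta_lam from a_0 = 0 (assumed recursion).

    Stops early once a_n exceeds `cap`, which is treated as numerical divergence.
    """
    seq, a = [], 0.0
    for _ in range(n_max):
        a = q * math.exp(a) - beta_lam
        seq.append(a)
        if a > cap:
            break
    return seq

def log_bound(p, q, alpha_lam, beta_lam, x0, n):
    """Return a_n * x0 + sum_{k=1}^{n} b_k with b_k = p * exp(a_{k-1}) - alpha_lam."""
    a_prev, total = 0.0, 0.0
    for _ in range(n):
        total += p * math.exp(a_prev) - alpha_lam   # b_k uses a_{k-1}
        a_prev = q * math.exp(a_prev) - beta_lam    # a_k
    return a_prev * x0 + total

# Hypothetical values for extremal case (ii): p = alpha_lam > 0, q > beta_lam > 0.
alpha_lam, beta_lam, p = 5.0, 0.4, 5.0
threshold = min(1.0, math.exp(beta_lam - 1.0))

conv = iterate_a(0.45, beta_lam, 200)               # q <= threshold: convergent
div = iterate_a(0.60, beta_lam, 200)                # q >  threshold: divergent
x_limit = conv[-1]
rate = p * math.exp(x_limit) - alpha_lam            # candidate growth rate

print(f"threshold min(1, exp(beta_lam - 1)) = {threshold:.3f}")
print(f"convergent case: limit {x_limit:.6f}, upper end -log(q) = {-math.log(0.45):.6f}")
print(f"divergent case: cap exceeded after {len(div)} steps")
print(f"(1/n) * log-bound at n = 5000: "
      f"{log_bound(p, 0.45, alpha_lam, beta_lam, 5, 5000) / 5000:.6f}, rate {rate:.6f}")
```

The early stop in `iterate_a` is a design choice to avoid floating-point overflow, since in the divergent regime the iterates leave any fixed range after a few steps.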

It remains to look at the cases where

p_{λ}^{L}, q_{λ}^{L}

satisfy (35), and (56) with two strict inequalities. For this situation, one gets

${(a_{n}^{(q_{λ}^{L})})}_{n \in N}$ is strictly positive, strictly increasing and–iff $q_{λ}^{L} \leq min {1, e^{β_{λ} - 1}}$ –convergent (namely to the smallest positive solution $x_{0}^{(q_{λ}^{L})} \in] 0, - log (q_{λ}^{L})]$ of (44)), cf. (P3);
${(b_{n}^{(p_{λ}^{L}, q_{λ}^{L})})}_{n \in N}$ is strictly increasing, strictly positive (since $b_{1}^{(p_{λ}^{L}, q_{λ}^{L})} = p_{λ}^{L} - α_{λ} > 0$ ) and–iff $q_{λ}^{L} \leq min {1, e^{β_{λ} - 1}}$ –convergent (namely to $p_{λ}^{L} e^{x_{0}^{(q_{λ}^{L})}} - α_{λ} \in [p_{λ}^{L} - α_{λ}, p_{λ}^{L} / q_{λ}^{L} - α_{λ}]$ ), cf. (P7).

Hence, under the assumptions (35) and

(p_{λ}^{L} > max {0, α_{λ}}) \land (q_{λ}^{L} > max {0, β_{λ}})

the corresponding lower bounds

B_{λ, X_{0}, n}^{L}

of the Hellinger integral

H_{λ} (P_{A, n} ∥ P_{H, n})

fulfill for all

X_{0} \in N

$B_{λ, X_{0}, n}^{L} > 1$ for all $n \in N$ ,
${lim}_{n \to \infty} B_{λ, X_{0}, n}^{L} = \infty$ ,
${lim}_{n \to \infty} \frac{1}{n} log (B_{λ, X_{0}, n}^{L}) = p_{λ}^{L} e^{x_{0}^{(q_{λ}^{L})}} - α_{λ} > 0$ for the case $q_{λ}^{L} \in] max {0, β_{λ}}, min {1, e^{β_{λ} - 1}}]$ , respectively ${lim}_{n \to \infty} \frac{1}{n} log (B_{λ, X_{0}, n}^{L}) = \infty$ for the remaining case $q_{λ}^{L} > min {1, e^{β_{λ} - 1}}$ .

Putting these considerations together we conclude that the constraints (35) and (56) are sufficient to achieve the Goals (G1

^{'}

) to (G3

^{'}

). Hence, for fixed parameter constellation

(β_{A}, β_{H}, α_{A}, α_{H}, λ)

, we aim for finding

p_{λ}^{L} = p^{L} (β_{A}, β_{H}, α_{A}, α_{H}, λ)

and

q_{λ}^{L} = q^{L} (β_{A}, β_{H}, α_{A}, α_{H}, λ)

which satisfy (35) and (56). This can be achieved mostly, but not always, as we shall show below. As an auxiliary step for further investigations, it is useful to examine the set of all

λ \in R \ [0, 1]

for which

α_{λ} \leq 0

or

β_{λ} \leq 0

(or both). By straightforward calculations, we see that

α_{λ} \leq 0 ⟺ λ \{\begin{matrix} \leq & \frac{- α_{H}}{α_{A} - α_{H}}, & if α_{A} > α_{H}, \\ \geq & \frac{α_{H}}{α_{H} - α_{A}}, & if α_{A} < α_{H}, \end{matrix} and β_{λ} \leq 0 ⟺ λ \{\begin{matrix} \leq & \frac{- β_{H}}{β_{A} - β_{H}}, & if β_{A} > β_{H}, \\ \geq & \frac{β_{H}}{β_{H} - β_{A}}, & if β_{A} < β_{H} . \end{matrix}

(57)
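The equivalences in (57) are easy to sanity-check numerically. The sketch below assumes the definitions α_λ = λα_A + (1−λ)α_H and β_λ = λβ_A + (1−λ)β_H, which are not restated in this passage but are consistent with (57); the helper names and the parameter pairs on the grid are hypothetical.

```python
def mix(lam, x_a, x_h):
    """Assumed definition x_lambda = lam * x_A + (1 - lam) * x_H (alpha_lambda, beta_lambda)."""
    return lam * x_a + (1 - lam) * x_h

def nonpositive_via_57(lam, x_a, x_h):
    """Threshold test on the right-hand side of (57); requires positive x_A != x_H."""
    if x_a > x_h:
        return lam <= -x_h / (x_a - x_h)
    return lam >= x_h / (x_h - x_a)

# Compare the direct sign check with the threshold form on a small grid of orders lam.
grid = [-6.0, -5.0, -3.0, -2.0, -0.5, 1.5, 2.0, 3.0, 6.0]
for x_a, x_h in [(3.0, 2.0), (6.0, 5.0), (1.0, 3.0), (2.0, 5.0)]:
    for lam in grid:
        assert (mix(lam, x_a, x_h) <= 0) == nonpositive_via_57(lam, x_a, x_h)
```

The same helper serves for both α_λ and β_λ, since (57) has the identical structure in the two parameter pairs.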

Furthermore, recall that (35) implies the general bounds

p_{λ}^{L} \leq α_{A}^{λ} α_{H}^{1 - λ} = φ_{λ} (0)

(being equivalent to the requirement

ϕ_{λ}^{L} (0) \leq ϕ_{λ} (0)

) and

q_{λ}^{L} \leq β_{A}^{λ} β_{H}^{1 - λ} = {\tilde{q}}_{λ}

(the latter being the maximal slope due to Properties 3 (P19), (P20)).

Let us now undertake the desired detailed investigations on lower and upper bounds of the Hellinger integrals

H_{λ} (P_{A, n} ∥ P_{H, n})

of order

λ \in R \ [0, 1]

, for the various different subclasses of

P_{SP} \ P_{SP, 1}

.

3.17. Lower Bounds for the Cases $(β_{A}, β_{H}, α_{A}, α_{H}, λ) \in P_{SP, 2} \times (R \ [0, 1])$

In such a constellation, where

P_{SP, 2} : = \{(β_{A}, β_{H}, α_{A}, α_{H}) \in P_{SP} : α_{A} = α_{H}, β_{A} \neq β_{H}\}

(cf. (49)), one gets

ϕ_{λ} (0) = 0

(cf. Properties 3 (P16)),

ϕ_{λ}^{'} (0) = 0

(cf. (P17)). Thus, the only choice for the intercept and the slope of the linear lower bound

ϕ_{λ}^{L} (\cdot)

for

ϕ_{λ} (\cdot)

, which satisfies (35) for all

x \in N_{0}

and (potentially) (56), is

r_{λ}^{L} = 0 = p_{λ}^{L} - α_{λ}

(i.e.,

p_{λ}^{L} = α_{λ} = α > 0

) and

s_{λ}^{L} = \frac{ϕ_{λ} (1) - ϕ_{λ} (0)}{1 - 0} = q_{λ}^{L} - β_{λ} = a_{1}^{(q_{λ}^{L})} > 0

(i.e.,

q_{λ}^{L} = {(α + β_{A})}^{λ} {(α + β_{H})}^{1 - λ} - α

). However, since

p_{λ}^{L} = α_{λ} = α > 0

, the restriction (56) is fulfilled iff

q_{λ}^{L} > 0

, which is equivalent to

λ \in I_{SP, 2} : = \{\begin{matrix} ] \frac{log (\frac{α}{α + β_{H}})}{log (\frac{α + β_{A}}{α + β_{H}})}, 0 [\cup] 1, \infty [, & if β_{A} > β_{H}, \\ ] - \infty, 0 [\cup] 1, \frac{log (\frac{α}{α + β_{H}})}{log (\frac{α + β_{A}}{α + β_{H}})} [, & if β_{A} < β_{H} . \end{matrix}

(58)

Suppose that

λ \in I_{SP, 2}

. As we have seen above, from Properties 1 (P3a) and (P3b) one can derive that

{(a_{n}^{(q_{λ}^{L})})}_{n \in N}

is strictly positive, strictly increasing, and converges to

x_{0}^{(q_{λ}^{L})} \in] 0, - log (q_{λ}^{L})]

iff

q_{λ}^{L} \leq min {1, e^{β_{λ} - 1}}

, and otherwise it diverges to ∞. Notice that both cases can occur: consider the parameter setup

(β_{A}, β_{H}, α_{A}, α_{H}) = (1.5, 0.5, 0.5, 0.5) \in P_{SP, 2}

, which leads to

I_{SP, 2} =] - 1, 0 [\cup] 1, \infty [

; within our running-example epidemiological context of Section 2.3, this corresponds to a “mild” infectious-disease-transmission situation

(H)

(with “low” reproduction number

β_{H} = 0.5

and importation mean of

α_{H} = 0.5

), whereas

(A)

describes a “dangerous” situation (with supercritical

β_{A} = 1.5

and

α_{A} = 0.5

). For

λ = - 0.5 \in I_{SP, 2}

one obtains

q_{λ}^{L} \approx 0.207 \leq min {1, e^{β_{λ} - 1}} \approx 0.368

, whereas for

λ = 2 \in I_{SP, 2}

one gets

q_{λ}^{L} = 3.5 > min {1, e^{β_{λ} - 1}} = 1

. Altogether, this leads to

Proposition 11.

For all

(β_{A}, β_{H}, α_{A}, α_{H}, λ) \in P_{SP, 2} \times I_{SP, 2}

and all initial population sizes

X_{0} \in N

there holds with

p_{λ}^{L} = α_{A} = α_{H} = α, q_{λ}^{L} = {(α + β_{A})}^{λ} {(α + β_{H})}^{1 - λ} - α

\begin{matrix} (a) & B_{λ, X_{0}, 1}^{L} = {\tilde{B}}_{λ, X_{0}, 1}^{(p_{λ}^{L}, q_{λ}^{L})} = exp \{(q_{λ}^{L} - β_{λ}) \cdot X_{0}\} > 1, \\ (b) & the sequence {(B_{λ, X_{0}, n}^{L})}_{n \in N} of lower bounds for H_{λ} (P_{A, n} ∥ P_{H, n}) given by \\ B_{λ, X_{0}, n}^{L} = {\tilde{B}}_{λ, X_{0}, n}^{(p_{λ}^{L}, q_{λ}^{L})} = exp \{a_{n}^{(q_{λ}^{L})} \cdot X_{0} + \sum_{k = 1}^{n} b_{k}^{(p_{λ}^{L}, q_{λ}^{L})}\} \\ is strictly increasing, \\ (c) & lim_{n \to \infty} B_{λ, X_{0}, n}^{L} = \infty = lim_{n \to \infty} H_{λ} (P_{A, n} ∥ P_{H, n}), \\ (d) & lim_{n \to \infty} \frac{1}{n} log B_{λ, X_{0}, n}^{L} = \{\begin{matrix} p_{λ}^{L} \cdot exp \{x_{0}^{(q_{λ}^{L})}\} - α > 0, & if q_{λ}^{L} \leq min \{1, e^{β_{λ} - 1}\}, \\ \infty, & if q_{λ}^{L} > min \{1, e^{β_{λ} - 1}\}, \end{matrix} \\ (e) & the map X_{0} \mapsto B_{λ, X_{0}, n}^{L} = {\tilde{B}}_{λ, X_{0}, n}^{(p_{λ}^{L}, q_{λ}^{L})} is strictly increasing . \end{matrix}
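The numerical example preceding Proposition 11 can be reproduced with a short script. It is only a sketch: the formula for q_λ^L and the finite endpoint of I_{SP,2} are taken from the text and from (58), whereas β_λ = λβ_A + (1−λ)β_H is an assumed definition and all function names are hypothetical.

```python
import math

def q_lower(lam, alpha, beta_a, beta_h):
    """q_lambda^L = (alpha + beta_A)^lam * (alpha + beta_H)^(1 - lam) - alpha on P_SP,2."""
    return (alpha + beta_a) ** lam * (alpha + beta_h) ** (1 - lam) - alpha

def isp2_endpoint(alpha, beta_a, beta_h):
    """Finite endpoint log(alpha/(alpha+beta_H)) / log((alpha+beta_A)/(alpha+beta_H)) in (58)."""
    return math.log(alpha / (alpha + beta_h)) / math.log((alpha + beta_a) / (alpha + beta_h))

def in_isp2(lam, alpha, beta_a, beta_h):
    """Membership test for the set I_SP,2 as displayed in (58)."""
    end = isp2_endpoint(alpha, beta_a, beta_h)
    if beta_a > beta_h:
        return end < lam < 0 or lam > 1
    return lam < 0 or 1 < lam < end

def a_converges(lam, alpha, beta_a, beta_h):
    """True iff q_lambda^L <= min{1, exp(beta_lambda - 1)}, the criterion quoted from (P3a)."""
    beta_lam = lam * beta_a + (1 - lam) * beta_h    # assumed definition of beta_lambda
    return q_lower(lam, alpha, beta_a, beta_h) <= min(1.0, math.exp(beta_lam - 1.0))

# Parameter setup from the text: (beta_A, beta_H, alpha_A, alpha_H) = (1.5, 0.5, 0.5, 0.5).
alpha, beta_a, beta_h = 0.5, 1.5, 0.5
print("finite endpoint of I_SP,2:", isp2_endpoint(alpha, beta_a, beta_h))
for lam in (-0.5, 2.0):
    print(lam, in_isp2(lam, alpha, beta_a, beta_h),
          round(q_lower(lam, alpha, beta_a, beta_h), 3),
          a_converges(lam, alpha, beta_a, beta_h))
```

For orders outside I_{SP,2}, for instance λ = −2 in this setup, `q_lower` becomes negative, which is exactly the situation excluded by the restriction (56).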

Nevertheless, for the remaining constellations

(β_{A}, β_{H}, α_{A}, α_{H}, λ) \in P_{SP, 2} \times R \ (I_{SP, 2} \cup [0, 1])

, all observation time horizons

n \in N

and all initial population sizes

X_{0} \in N

one can still prove

1 < H_{λ} (P_{A, n} ∥ P_{H, n}) and lim_{n \to \infty} H_{λ} (P_{A, n} ∥ P_{H, n}) = \infty,

(59)

(i.e., the achievement of the Goals (G1

^{'}

), (G2

^{'}

)), which is done by a conceptually different method (without involving

p_{λ}^{L}, q_{λ}^{L}

) in Appendix A.1.

3.18. Lower Bounds for the Cases $(β_{A}, β_{H}, α_{A}, α_{H}, λ) \in P_{SP, 3 a} \times (R \ [0, 1])$

In the current setup, where

P_{SP, 3 a} : = \{(β_{A}, β_{H}, α_{A}, α_{H}) \in P_{SP} : α_{A} \neq α_{H}, β_{A} \neq β_{H}, \frac{α_{A}}{β_{A}} \neq \frac{α_{H}}{β_{H}}, \frac{α_{A} - α_{H}}{β_{H} - β_{A}} \in] - \infty, 0 [\}

(cf. (49)), we always have either

(α_{A} > α_{H}) \land (β_{A} > β_{H})

or

(α_{A} < α_{H}) \land (β_{A} < β_{H})

. Furthermore, from Properties 3 (P16) we obtain

ϕ_{λ} (0) > 0

. As in the case

λ \in] 0, 1 [

, the derivative

ϕ_{λ}^{'} (0)

can assume any sign on

P_{SP, 3 a}

, take e.g.,

(β_{A}, β_{H}, α_{A}, α_{H}, λ) = (2.2, 4.5, 1, 3, 2)

for

ϕ_{λ}^{'} (0) < 0

,

(β_{A}, β_{H}, α_{A}, α_{H}, λ) = (2.25, 4.5, 1, 3, 2)

for

ϕ_{λ}^{'} (0) = 0

and

(β_{A}, β_{H}, α_{A}, α_{H}, λ) = (2.3, 4.5, 1, 3, 2)

for

ϕ_{λ}^{'} (0) > 0

(these parameter constellations reflect “dangerous” (

A

) versus “highly dangerous” (

H

) situations within our running-example epidemiological context of Section 2.3). Nevertheless, in all three subcases one gets

{min}_{x \in N_{0}} ϕ_{λ} (x) \geq {min}_{x \geq 0} ϕ_{λ} (x) > 0

. Thus, there exist parameters

p_{λ}^{L} \in] α_{λ}, α_{A}^{λ} α_{H}^{1 - λ}]

and

q_{λ}^{L} \in] β_{λ}, β_{A}^{λ} β_{H}^{1 - λ}]

which satisfy (35) (in particular,

p_{λ}^{L} - α_{λ} > 0, q_{λ}^{L} - β_{λ} > 0

). We now have to look for a condition which guarantees that these parameters additionally fulfill (56); such a condition is clearly that both

α_{λ} \geq 0

and

β_{λ} \geq 0

hold, which is equivalent (cf. (57)) to

λ \in I_{SP, 3 a}^{(\geq)} : = \{\begin{matrix} [max \{\frac{- α_{H}}{α_{A} - α_{H}}, \frac{- β_{H}}{β_{A} - β_{H}}\}, 0 [\cup] 1, \infty [, & if (α_{A} > α_{H}) \land (β_{A} > β_{H}), \\ ] - \infty, 0 [\cup] 1, min \{\frac{α_{H}}{α_{H} - α_{A}}, \frac{β_{H}}{β_{H} - β_{A}}\}], & if (α_{A} < α_{H}) \land (β_{A} < β_{H}); \end{matrix}

recall that

α_{λ} = 0

and

β_{λ} = 0

cannot occur simultaneously in the current setup. If

α_{λ} \leq 0

and

β_{λ} \leq 0

, i.e., if

λ \in I_{SP, 3 a}^{(<)} : = \{\begin{matrix} ] - \infty, min \{\frac{- α_{H}}{α_{A} - α_{H}}; \frac{- β_{H}}{β_{A} - β_{H}}\}], & if (α_{A} > α_{H}) \land (β_{A} > β_{H}), \\ [max \{\frac{α_{H}}{α_{H} - α_{A}}; \frac{β_{H}}{β_{H} - β_{A}}\}, \infty [, & if (α_{A} < α_{H}) \land (β_{A} < β_{H}), \end{matrix}

then–due to the strict positivity of the function

φ_{λ} (\cdot)

(cf. (31))–there exist parameters

p_{λ}^{L} > 0 = max {0, α_{λ}}

and

q_{λ}^{L} > 0 = max {0, β_{λ}}

which satisfy (56) and (34) (where the latter implies (35) and thus

p_{λ}^{L} \leq α_{A}^{λ} α_{H}^{1 - λ}, q_{λ}^{L} \leq β_{A}^{λ} β_{H}^{1 - λ}

). With

I_{SP, 3 a} : = I_{SP, 3 a}^{(\geq)} \cup I_{SP, 3 a}^{(<)}

(60)

and with the discussion below (56), we thus derive the following

Proposition 12.

For all

(β_{A}, β_{H}, α_{A}, α_{H}, λ) \in P_{SP, 3 a} \times I_{SP, 3 a}

there exist parameters

p_{λ}^{L}, q_{λ}^{L}

which satisfy

max {0, α_{λ}} < p_{λ}^{L} \leq α_{A}^{λ} α_{H}^{1 - λ}, max {0, β_{λ}} < q_{λ}^{L} \leq β_{A}^{λ} β_{H}^{1 - λ}

as well as (35) for all

x \in N_{0}

, and for all such pairs

(p_{λ}^{L}, q_{λ}^{L})

and all initial population sizes

X_{0} \in N

one gets

\begin{matrix} (a) & B_{λ, X_{0}, 1}^{L} = {\tilde{B}}_{λ, X_{0}, 1}^{(p_{λ}^{L}, q_{λ}^{L})} = exp \{(q_{λ}^{L} - β_{λ}) \cdot X_{0} + p_{λ}^{L} - α_{λ}\} > 1, \\ (b) & the sequence {(B_{λ, X_{0}, n}^{L})}_{n \in N} of lower bounds for H_{λ} (P_{A, n} ∥ P_{H, n}) given by \\ B_{λ, X_{0}, n}^{L} = {\tilde{B}}_{λ, X_{0}, n}^{(p_{λ}^{L}, q_{λ}^{L})} = exp \{a_{n}^{(q_{λ}^{L})} \cdot X_{0} + \sum_{k = 1}^{n} b_{k}^{(p_{λ}^{L}, q_{λ}^{L})}\} \\ is strictly increasing, \\ (c) & lim_{n \to \infty} B_{λ, X_{0}, n}^{L} = \infty = lim_{n \to \infty} H_{λ} (P_{A, n} ∥ P_{H, n}), \\ (d) & lim_{n \to \infty} \frac{1}{n} log B_{λ, X_{0}, n}^{L} = \{\begin{matrix} p_{λ}^{L} \cdot exp \{x_{0}^{(q_{λ}^{L})}\} - α_{λ} > 0, & if q_{λ}^{L} \leq min \{1, e^{β_{λ} - 1}\}, \\ \infty, & if q_{λ}^{L} > min \{1, e^{β_{λ} - 1}\}, \end{matrix} \\ (e) & the map X_{0} \mapsto B_{λ, X_{0}, n}^{L} = {\tilde{B}}_{λ, X_{0}, n}^{(p_{λ}^{L}, q_{λ}^{L})} is strictly increasing . \end{matrix}

Notice that the assertions (a) to (e) of Proposition 12 hold true for parameter pairs

(p_{λ}^{L}, q_{λ}^{L})

whenever they satisfy (35) and (56); in particular, we may allow either

p_{λ}^{L} = max {0, α_{λ}}

or

q_{λ}^{L} = max {0, β_{λ}}

. Let us furthermore mention that in part (d) both asymptotical behaviours can occur: consider e.g., the parameter setup

(β_{A}, β_{H}, α_{A}, α_{H}) = (0.3, 0.2, 4, 3) \in P_{SP, 3 a}

, leading to

] 1, \infty [⊊ I_{SP, 3 a}^{(\geq)} ⊊ I_{SP, 3 a}

. For

λ = 2 \in I_{SP, 3 a}

, the parameters

p_{λ}^{L} : = {\tilde{p}}_{λ} : = 5.25, q_{λ}^{L} : = {\tilde{q}}_{λ} : = 0.45

(corresponding to the asymptote

{\tilde{ϕ}}_{λ} (\cdot)

, cf. (P20)) fulfill (35), (56) and additionally

q_{λ}^{L} = 0.45 < min {1, e^{β_{λ} - 1}} \approx 0.549

. Analogously, in the setup

(β_{A}, β_{H}, α_{A}, α_{H}, λ) = (3, 2, 4, 3, 2) \in P_{SP, 3 a} \times I_{SP, 3 a}

, the choices

p_{λ}^{L} : = {\tilde{p}}_{λ} : = 5.25, q_{λ}^{L} : = {\tilde{q}}_{λ} : = 4.5

satisfy (35), (56) and there holds

q_{λ}^{L} = 4.5 > min {1, e^{β_{λ} - 1}} = 1

.
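The two parameter setups just discussed can be verified numerically. The following sketch rests on assumptions that are not spelled out in this passage: φ_λ(x) = (α_A + β_A x)^λ (α_H + β_H x)^{1−λ}; the constraint (35) is read as p + q·x ≤ φ_λ(x) on N_0 (checked only on a finite range); and the intercept of the asymptote is computed as p̃ = q̃·(λα_A/β_A + (1−λ)α_H/β_H), a formula derived here because it reproduces the printed value 5.25 in both setups. All function names are hypothetical.

```python
import math

def asymptote_params(lam, beta_a, beta_h, alpha_a, alpha_h):
    """Return (p_tilde, q_tilde): assumed intercept and slope parameters of the asymptote."""
    q = beta_a ** lam * beta_h ** (1 - lam)
    p = q * (lam * alpha_a / beta_a + (1 - lam) * alpha_h / beta_h)
    return p, q

def satisfies_35(p, q, lam, beta_a, beta_h, alpha_a, alpha_h, x_max=1000):
    """Check p + q*x <= phi_lambda(x) for x = 0..x_max (finite range, small tolerance)."""
    return all(
        p + q * x <= (alpha_a + beta_a * x) ** lam * (alpha_h + beta_h * x) ** (1 - lam) + 1e-9
        for x in range(x_max + 1)
    )

def satisfies_56(p, q, alpha_lam, beta_lam):
    """Requirement (56) as displayed in Section 3.16."""
    lo_p, lo_q = max(0.0, alpha_lam), max(0.0, beta_lam)
    return (q >= lo_q and p > lo_p) or (q > lo_q and p >= lo_p)

for beta_a, beta_h, alpha_a, alpha_h, lam in [(0.3, 0.2, 4, 3, 2), (3, 2, 4, 3, 2)]:
    alpha_lam = lam * alpha_a + (1 - lam) * alpha_h   # assumed definition
    beta_lam = lam * beta_a + (1 - lam) * beta_h      # assumed definition
    p, q = asymptote_params(lam, beta_a, beta_h, alpha_a, alpha_h)
    print(round(p, 4), round(q, 4),
          satisfies_35(p, q, lam, beta_a, beta_h, alpha_a, alpha_h),
          satisfies_56(p, q, alpha_lam, beta_lam),
          q <= min(1.0, math.exp(beta_lam - 1.0)))
```

The last printed flag indicates which branch of part (d) applies: it is true for the first setup (finite growth rate) and false for the second (infinite growth rate).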

For the remaining two cases

(α_{λ} \leq 0) \land (β_{λ} > 0)

(e.g.,

(β_{A}, β_{H}, α_{A}, α_{H}, λ) = (6, 5, 3, 2, - 3)

) and

(α_{λ} > 0) \land (β_{λ} \leq 0)

(e.g.,

(β_{A}, β_{H}, α_{A}, α_{H}, λ) = (3, 2, 6, 5, - 3)

), one has to proceed differently. Indeed, for all parameter constellations

(β_{A}, β_{H}, α_{A}, α_{H}, λ) \in P_{SP, 3 a} \times R \ (I_{SP, 3 a} \cup [0, 1])

, all observation time horizons

n \in N

and all initial population sizes

X_{0} \in N

one can still prove

1 < H_{λ} (P_{A, n} ∥ P_{H, n}), and lim_{n \to \infty} H_{λ} (P_{A, n} ∥ P_{H, n}) = \infty,

(61)

which is done in Appendix A.1, using a similar method as in the proof of assertion (59).

3.19. Lower Bounds for the Cases $(β_{A}, β_{H}, α_{A}, α_{H}, λ) \in P_{SP, 3 b} \times (R \ [0, 1])$

Within such a constellation, where

P_{SP, 3 b} : = \{(β_{A}, β_{H}, α_{A}, α_{H}) \in P_{SP} : α_{A} \neq α_{H}, β_{A} \neq β_{H}, \frac{α_{A}}{β_{A}} \neq \frac{α_{H}}{β_{H}}, \frac{α_{A} - α_{H}}{β_{H} - β_{A}} \in] 0, \infty [\ N\}

(cf. (49)), one always has either

(α_{A} < α_{H}) \land (β_{A} > β_{H})

or

(α_{A} > α_{H}) \land (β_{A} < β_{H})

. Moreover, from Properties 3 (P15) one can see that

ϕ_{λ} (x) = 0

for

x = x^{*} = \frac{α_{H} - α_{A}}{β_{A} - β_{H}} > 0

. However,

x^{*} \notin N_{0}

, which implies

ϕ_{λ} (x) > 0

for all x on the relevant subdomain

N_{0}

. Again, we incorporate (57) and consider the set of all

λ \in R \ [0, 1]

such that

α_{λ} \geq 0

and

β_{λ} \geq 0

(where

α_{λ} = 0 \land β_{λ} = 0

cannot appear), i.e.,

λ \in I_{SP, 3 b}^{(\geq)} : = \{\begin{matrix} [\frac{- β_{H}}{β_{A} - β_{H}}, 0 [\cup] 1, \frac{α_{H}}{α_{H} - α_{A}}], & if (α_{A} < α_{H}) \land (β_{A} > β_{H}), \\ [\frac{- α_{H}}{α_{A} - α_{H}}, 0 [\cup] 1, \frac{β_{H}}{β_{H} - β_{A}}], & if (α_{A} > α_{H}) \land (β_{A} < β_{H}) . \end{matrix}

(62)

As above in Section 3.18, if

λ \in I_{SP, 3 b}^{(\geq)}

then there exist parameters

p_{λ}^{L} \in] α_{λ}, α_{A}^{λ} α_{H}^{1 - λ}]

,

q_{λ}^{L} \in] β_{λ}, β_{A}^{λ} β_{H}^{1 - λ}]

(which thus fulfill (56)) such that (35) is satisfied for all

x \in N_{0}

. Hence, for all

λ \in I_{SP, 3 b} : = I_{SP, 3 b}^{(\geq)}

, all assertions (a) to (e) of Proposition 12 hold true. Notice that for the current setup

P_{SP, 3 b}

one cannot have

α_{λ} \leq 0

and

β_{λ} \leq 0

simultaneously. Furthermore, in each of the two remaining cases

(α_{λ} < 0) \land (β_{λ} > 0)

respectively

(α_{λ} > 0) \land (β_{λ} < 0)

it can happen that there do not exist parameters

p_{λ}^{L}, q_{λ}^{L} > 0

which satisfy both (35) and (56). However, as in the case

P_{SP, 3 a}

above, for all

λ \notin I_{SP, 3 b}

we prove in Appendix A.1 (by a method without

p_{λ}^{L}, q_{λ}^{L}

) that for all observation times

n \in N

and all initial population sizes

X_{0} \in N

there holds

1 < H_{λ} (P_{A, n} ∥ P_{H, n}) and lim_{n \to \infty} H_{λ} (P_{A, n} ∥ P_{H, n}) = \infty .

(63)

3.20. Lower Bounds for the Cases $(β_{A}, β_{H}, α_{A}, α_{H}, λ) \in P_{SP, 3 c} \times (R \ [0, 1])$

Since in this subcase one has

P_{SP, 3 c} : = \{(β_{A}, β_{H}, α_{A}, α_{H}) \in P_{SP} : α_{A} \neq α_{H}, β_{A} \neq β_{H}, \frac{α_{A}}{β_{A}} \neq \frac{α_{H}}{β_{H}}, \frac{α_{A} - α_{H}}{β_{H} - β_{A}} \in N\}

(cf. (49)) and thus

ϕ_{λ} (x^{*}) = 0

for

x^{*} \in N

, there do not exist parameters

p_{λ}^{L}, q_{λ}^{L}

such that (35) and (56) are satisfied. The only parameter pair that ensures

exp \{a_{n}^{(q_{λ}^{L})} \cdot X_{0} + \sum_{k = 1}^{n} b_{k}^{(p_{λ}^{L}, q_{λ}^{L})}\} \geq 1

for all

n \in N

and all

X_{0} \in N

within our proposed method, is the choice

p_{λ}^{L} = α_{λ}, q_{λ}^{L} = β_{λ}

. Consequently,

B_{λ, X_{0}, n}^{L} \equiv 1

, which coincides with the general lower bound (11) but violates the above-mentioned desired Goal (G1

^{'}

). However, in some constellations there exist nonnegative parameters

p_{λ}^{L} < α_{λ}, q_{λ}^{L} > β_{λ}

or

p_{λ}^{L} > α_{λ}, q_{λ}^{L} < β_{λ}

, such that at least the parts (c) and (d) of Proposition 12 are satisfied. As in Section 3.19 above, by using a conceptually different method (without

p_{λ}^{L}, q_{λ}^{L}

) we prove in Appendix A.1 that for all

λ \in R \ [0, 1]

, all observation times

n \in N

and all initial population sizes

X_{0} \in N

there holds

1 < H_{λ} (P_{A, n} ∥ P_{H, n}) and lim_{n \to \infty} H_{λ} (P_{A, n} ∥ P_{H, n}) = \infty .

(64)

3.21. Lower Bounds for the Cases $(β_{A}, β_{H}, α_{A}, α_{H}, λ) \in P_{SP, 4 a} \times (R \ [0, 1])$

In the current setup, where

P_{SP, 4 a} : = \{(β_{A}, β_{H}, α_{A}, α_{H}) \in P_{SP} : α_{A} \neq α_{H} > 0, β_{A} = β_{H} \in] 0, 1 [\}

(cf. (49)), the function

ϕ_{λ} (\cdot)

is strictly positive and strictly decreasing, with

{lim}_{x \to \infty} ϕ_{λ} (x) = {lim}_{x \to \infty} ϕ_{λ}^{'} (x) = 0

. The only choice of parameters

p_{λ}^{L}, q_{λ}^{L}

which fulfill (35) and

exp \{a_{n}^{(q_{λ}^{L})} \cdot X_{0} + \sum_{k = 1}^{n} b_{k}^{(p_{λ}^{L}, q_{λ}^{L})}\} \geq 1

for all

n \in N

and all

X_{0} \in N

, is the choice

p_{λ}^{L} = α_{λ}

as well as

q_{λ}^{L} = β_{λ} = β_{•}

, where

β_{•}

stands for both (equal)

β_{H}

and

β_{A}

. Of course, this leads to

B_{λ, X_{0}, n}^{L} \equiv 1

, which is consistent with the general lower bound (11), but violates the above-mentioned desired Goal (G1

^{'}

). Nevertheless, in Appendix A.1 we prove the following

Proposition 13.

For all

(β_{A}, β_{H}, α_{A}, α_{H}, λ) \in P_{SP, 4 a} \times R \ [0, 1]

there exist parameters

p_{λ}^{L} > α_{λ}

(not necessarily satisfying

p_{λ}^{L} \geq 0

) and

0 < q_{λ}^{L} < β_{λ} = β_{•} < min {1, e^{β_{•} - 1}} = e^{β_{•} - 1}

such that (35) holds for all

x \in [0, \infty [

and such that for all initial population sizes

X_{0} \in N

the parts (c) and (d) of Proposition 12 hold true.

3.22. Lower Bounds for the Cases $(β_{A}, β_{H}, α_{A}, α_{H}, λ) \in P_{SP, 4 b} \times (R \ [0, 1])$

By recalling

P_{SP, 4 b} : = \{(β_{A}, β_{H}, α_{A}, α_{H}) \in P_{SP} : α_{A} \neq α_{H} > 0, β_{A} = β_{H} \in [1, \infty [\}

(cf. (49)), the assertions preceding Proposition 13 remain valid. However, the proof of Proposition 13 in Appendix A.1 contains details which explain why it cannot be carried over to the current case

P_{SP, 4 b}

. Thus, the generally valid lower bound

B_{λ, X_{0}, n}^{L} \equiv 1

cannot be improved with our methods.

3.23. Concluding Remarks on Alternative Lower Bounds for all Cases $(β_{A}, β_{H}, α_{A}, α_{H}, λ) \in (P_{SP} \ P_{SP, 1}) \times (R \ [0, 1])$

To achieve the Goals (G1

^{'}

) to (G3

^{'}

), in the above-mentioned investigations about lower bounds of the Hellinger integral

H_{λ} (P_{A, n} ∥ P_{H, n})

,

λ \in R \ [0, 1]

, we have mainly focused on parameters

p_{λ}^{L}, q_{λ}^{L}

which satisfy (35) and additionally (56). Nevertheless, Theorem 1 (b) gives lower bounds

B_{λ, X_{0}, n}^{L}

whenever (35) is fulfilled. However, this lower bound can be the trivial one,

B_{λ, X_{0}, n}^{L} \equiv 1

. Let us remark here that for the parameter constellations

(β_{A}, β_{H}, α_{A}, α_{H}, λ) \in (P_{SP, 2} \times R \ ([0, 1] \cup I_{SP, 2})) \cup (P_{SP, 3 a} \times R \ ([0, 1] \cup I_{SP, 3 a})) \cup (P_{SP, 3 b} \times R \ ([0, 1] \cup I_{SP, 3 b}))

one can prove that there exist

p_{λ}^{L}, q_{λ}^{L}

which satisfy (35) for all

x \in N_{0}

as well as the condition (generalizing (56))

p_{λ}^{L} \geq α_{λ}, q_{λ}^{L} \geq β_{λ}, (where at least one of the inequalities is strict),

and that for such

p_{λ}^{L}, q_{λ}^{L}

one gets the validity of

H_{λ} (P_{A, n} ∥ P_{H, n}) \geq B_{λ, X_{0}, n}^{L} = {\tilde{B}}_{λ, X_{0}, n}^{(p_{λ}^{L}, q_{λ}^{L})} > 1

for all

X_{0} \in N

and all

n \in N

; consequently, Goal (G1

^{'}

) is achieved. However, in these parameter constellations it can unpleasantly happen that

n \mapsto B_{λ, X_{0}, n}^{L}

is oscillating (in contrast to the monotone behaviour in the Propositions 11 (b), 12 (b)).

As a final general remark, let us mention that the functions

ϕ_{λ, y}^{\tan} (\cdot)

,

ϕ_{λ, k}^{\sec} (\cdot)

,

ϕ_{λ}^{hor} (\cdot)

,

\tilde{ϕ_{λ}} (\cdot)

–defined in (52)–(54) and Properties 3 (P20)–constitute linear lower bounds for

ϕ_{λ} (\cdot)

on the domain

N_{0}

in the case

λ \in R \ [0, 1]

. Their parameters

p_{λ}^{L} \in \{p_{λ, y}^{\tan}, p_{λ, y}^{\sec}, p_{λ, y}^{hor}, \tilde{p_{λ}}\}

and

q_{λ}^{L} \in \{q_{λ, y}^{\tan}, q_{λ, y}^{\sec}, q_{λ, y}^{hor}, \tilde{q_{λ}}\}

lead to lower bounds

B_{λ, X_{0}, n}^{L}

of the Hellinger integrals that may or may not be consistent with Goals (G1

^{'}

) to (G3

^{'}

), and which may be better than, weaker than, or incomparable with the previous lower bounds when adding some relaxation of (G1

^{'}

), such as e.g., the validity of

H_{λ} (P_{A, n} ∥ P_{H, n}) > 1

for all but finitely many

n \in N

.

3.24. Upper Bounds for the Cases $(β_{A}, β_{H}, α_{A}, α_{H}, λ) \in (P_{SP} \ P_{SP, 1}) \times (R \ [0, 1])$

For the cases

λ \in R \ [0, 1]

, the investigation of upper bounds for the Hellinger integral

H_{λ} (P_{A, n} ∥ P_{H, n})

is much easier than the above-mentioned derivations of lower bounds. In fact, we face a situation which is similar to the lower-bounds-studies for the cases

λ \in] 0, 1 [

: due to Properties 3 (P19), the function

ϕ_{λ} (\cdot)

is strictly convex on the nonnegative real line. Furthermore, it is asymptotically linear, as stated in (P20). The monotonicity Properties 2 (P10) to (P12) imply that for the tightest upper bound (within our framework) one should use the parameters

p_{λ}^{U} : = α_{A}^{λ} α_{H}^{1 - λ} > 0

and

q_{λ}^{U} : = β_{A}^{λ} β_{H}^{1 - λ} > 0

. Lemma A1 states that

p_{λ}^{U} \geq α_{λ}

resp.

q_{λ}^{U} \geq β_{λ}

, with equality iff

α_{A} = α_{H}

resp. iff

β_{A} = β_{H}

. From Properties 1 (P3a) we see that for

β_{A} \neq β_{H}

the corresponding sequence

{(a_{n}^{(q_{λ}^{U})})}_{n \in N}

is convergent to

x_{0}^{(q_{λ}^{U})} \in] 0, - log (q_{λ}^{U})]

if

q_{λ}^{U} \leq min {1, e^{β_{λ} - 1}}

(i.e., if

λ \in [λ_{-}, λ_{+}]

, cf. Lemma 1 (a)), and otherwise it diverges to ∞ faster than exponentially (cf. (P3b)). If

β_{A} = β_{H}

(i.e., if

(β_{A}, β_{H}, α_{A}, α_{H}) \in P_{SP, 4} = P_{SP, 4 a} \cup P_{SP, 4 b}

), then one gets

q_{λ}^{U} = β_{λ}

and

a_{n}^{(q_{λ}^{U})} = 0 = x_{0}^{(q_{λ}^{U})}

for all

n \in N

(cf. (P2)). Altogether, this leads to

Proposition 14.

For all

(β_{A}, β_{H}, α_{A}, α_{H}, λ) \in (P_{SP} \ P_{SP, 1}) \times (R \ [0, 1])

and all initial population sizes

X_{0} \in N

there holds with

p_{λ}^{U} : = α_{A}^{λ} α_{H}^{1 - λ}, q_{λ}^{U} : = β_{A}^{λ} β_{H}^{1 - λ}

\begin{matrix} (a) & B_{λ, X_{0}, 1}^{U} = {\tilde{B}}_{λ, X_{0}, 1}^{(p_{λ}^{U}, q_{λ}^{U})} = exp \{(β_{A}^{λ} β_{H}^{1 - λ} - β_{λ}) \cdot X_{0} + α_{A}^{λ} α_{H}^{1 - λ} - α_{λ}\} > 1, \\ (b) & the sequence {(B_{λ, X_{0}, n}^{U})}_{n \in N} of upper bounds for H_{λ} (P_{A, n} ∥ P_{H, n}) given by \\ B_{λ, X_{0}, n}^{U} = {\tilde{B}}_{λ, X_{0}, n}^{(p_{λ}^{U}, q_{λ}^{U})} = exp \{a_{n}^{(q_{λ}^{U})} \cdot X_{0} + \sum_{k = 1}^{n} b_{k}^{(p_{λ}^{U}, q_{λ}^{U})}\} \\ is strictly increasing, \\ (c) & lim_{n \to \infty} B_{λ, X_{0}, n}^{U} = \infty, \\ (d) & lim_{n \to \infty} \frac{1}{n} log B_{λ, X_{0}, n}^{U} = \{\begin{matrix} p_{λ}^{U} \cdot exp \{x_{0}^{(q_{λ}^{U})}\} - α_{λ} > 0, & if λ \in [λ_{-}, λ_{+}] \ [0, 1], \\ \infty, & if λ \in] - \infty, λ_{-} [\cup] λ_{+}, \infty [, \end{matrix} \\ (e) & the map X_{0} \mapsto B_{λ, X_{0}, n}^{U} = {\tilde{B}}_{λ, X_{0}, n}^{(p_{λ}^{U}, q_{λ}^{U})} is strictly increasing . \end{matrix}
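As a numerical illustration of Proposition 14, the following Python sketch evaluates the upper bound $B_{λ, X_{0}, n}^{U}$. It relies on definitions from Section 3 that are not restated here and are therefore assumptions of this sketch: $α_{λ} = λ α_{A} + (1 - λ) α_{H}$, $β_{λ} = λ β_{A} + (1 - λ) β_{H}$, the recursion $a_{0}^{(q)} = 0$, $a_{n}^{(q)} = q \cdot e^{a_{n - 1}^{(q)}} - β_{λ}$, and $b_{n}^{(p, q)} = p \cdot e^{a_{n - 1}^{(q)}} - α_{λ}$ (these are consistent with part (a) above, where $a_{1} = q - β_{λ}$ and $b_{1} = p - α_{λ}$). The function name and the parameter values are hypothetical and chosen only for illustration.

```python
import math

def hellinger_upper_bound(beta_A, beta_H, alpha_A, alpha_H, lam, X0, n):
    """Sketch of B^U_{lam,X0,n} = exp{a_n * X0 + sum_{k=1..n} b_k}
    with p = alpha_A^lam * alpha_H^(1-lam), q = beta_A^lam * beta_H^(1-lam)."""
    alpha_lam = lam * alpha_A + (1 - lam) * alpha_H   # assumed definition of alpha_lambda
    beta_lam = lam * beta_A + (1 - lam) * beta_H      # assumed definition of beta_lambda
    p = alpha_A**lam * alpha_H**(1 - lam)
    q = beta_A**lam * beta_H**(1 - lam)
    a, b_sum = 0.0, 0.0                               # a_0 = 0
    for _ in range(n):
        b_sum += p * math.exp(a) - alpha_lam          # b_k is built from a_{k-1}
        a = q * math.exp(a) - beta_lam                # a_k is built from a_{k-1}
    return math.exp(a * X0 + b_sum)

# Illustrative constellation with beta_A != beta_H, alpha_A/beta_A != alpha_H/beta_H, lam = 2
bounds = [hellinger_upper_bound(0.5, 0.8, 2.0, 1.0, 2.0, 5, n) for n in range(1, 8)]
```

For this choice of parameters the recursion $a_{n}^{(q_{λ}^{U})}$ stays bounded, so the values in `bounds` can be inspected for the monotonicity stated in part (b) without numerical overflow; for orders outside $[λ_{-}, λ_{+}]$ the faster-than-exponential growth will quickly exceed floating-point range.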

4. Power Divergences of Non-Kullback-Leibler-Information-Divergence Type

4.1. A First Basic Result

For orders

λ \in R \ {0, 1}

, all the results of the previous Section 3 carry over correspondingly from the Hellinger integrals

H_{λ} (\cdot ∥ \cdot)

to the total variation distance

V (\cdot | | \cdot)

, by virtue of the relation (cf. (12))

2 (1 - H_{\frac{1}{2}} (P_{A, n} ∥ P_{H, n})) \leq V (P_{A, n} ∥ P_{H, n}) \leq 2 \sqrt{1 - {(H_{\frac{1}{2}} (P_{A, n} ∥ P_{H, n}))}^{2}},

to the Renyi divergences

R_{λ} (\cdot ∥ \cdot)

, by virtue of the relation (cf. (7))

0 \leq R_{λ} (P_{A, n} ∥ P_{H, n}) = \frac{1}{λ (λ - 1)} log H_{λ} (P_{A, n} ∥ P_{H, n}), with log 0 : = - \infty,

as well as to the power divergences

I_{λ} (\cdot ∥ \cdot)

, by virtue of the relation (cf. (2))

I_{λ} (P_{A, n} ∥ P_{H, n}) = \frac{1 - H_{λ} (P_{A, n} ∥ P_{H, n})}{λ \cdot (1 - λ)}, n \in N;

in the following, we concentrate on the latter. In particular, the above-mentioned carrying-over procedure leads to bounds on

I_{λ} (P_{A, n} ∥ P_{H, n})

which are tighter than the general rudimentary bounds (cf. (10) and (11))

0 \leq I_{λ} (P_{A, n} ∥ P_{H, n}) < \frac{1}{λ (1 - λ)}, for λ \in] 0, 1 [, 0 \leq I_{λ} (P_{A, n} ∥ P_{H, n}) \leq \infty, for λ \in R \ [0, 1] .
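The three carrying-over relations above are simple transforms of a Hellinger integral. The following Python sketch spells them out; the function names are hypothetical, and the input `H` stands for any value (or bound) of $H_{λ} (P_{A, n} ∥ P_{H, n})$ obtained elsewhere.

```python
import math

def renyi_from_hellinger(H, lam):
    """R_lam = log(H) / (lam * (lam - 1)), with the convention log 0 := -inf."""
    log_H = math.log(H) if H > 0 else -math.inf
    return log_H / (lam * (lam - 1))

def power_divergence_from_hellinger(H, lam):
    """I_lam = (1 - H) / (lam * (1 - lam)), for lam not in {0, 1}."""
    return (1 - H) / (lam * (1 - lam))

def total_variation_bounds(H_half):
    """Lower and upper bound on V derived from the Hellinger integral of order 1/2."""
    lower = 2 * (1 - H_half)
    upper = 2 * math.sqrt(max(0.0, 1 - H_half**2))  # guard against rounding above 1
    return lower, upper
```

Since both transforms $H \mapsto R_{λ}$ and $H \mapsto I_{λ}$ are monotone for fixed $λ$, a lower (upper) bound on the Hellinger integral translates into a bound on the divergence, with the direction depending on whether $λ \in] 0, 1 [$ or $λ \in R \ [0, 1]$.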

Because power divergences have a very insightful interpretation as “directed distances” between two probability distributions (e.g., within our running-example epidemiological context), and function as important tools in statistics, information theory, machine learning, and artificial intelligence, we present explicitly the resulting exact values respectively bounds of

I_{λ} (P_{A, n} ∥ P_{H, n})

(

λ \in R \ {0, 1}

,

n \in N

), in the current and the following subsections. For this, recall the case-dependent parameters

p^{A} = p_{λ}^{A} = p^{A} (β_{A}, β_{H}, α_{A}, α_{H}, λ)

and

q^{A} = q_{λ}^{A} = q^{A} (β_{A}, β_{H}, α_{A}, α_{H}, λ)

(

A \in {E, L, U}

). To begin with, we can deduce from Theorem 1

Theorem 2.

(a): For all $(β_{A}, β_{H}, α_{A}, α_{H}) \in (P_{NI} \cup P_{SP, 1})$ , all initial population sizes $X_{0} \in N_{0}$ , all observation horizons $n \in N$ and all $λ \in R \ {0, 1}$ one can recursively compute the exact value

$I_{λ} (P_{A, n} ∥ P_{H, n}) = \frac{1}{λ (λ - 1)} \cdot [exp \{a_{n}^{(q_{λ}^{E})} \cdot X_{0} + \frac{α_{A}}{β_{A}} \sum_{k = 1}^{n} a_{k}^{(q_{λ}^{E})}\} - 1] = : V_{λ, X_{0}, n}^{I},$

(65)

where $\frac{α_{A}}{β_{A}}$ can be equivalently replaced by $\frac{α_{H}}{β_{H}}$ and $q_{λ}^{E} : = β_{A}^{λ} β_{H}^{1 - λ}$ . Notice that on $P_{NI}$ the formula (65) simplifies significantly, since $α_{A} = α_{H} = 0$ .
(b): For general parameters $p \in R$ , $q \neq 0$ recall the general expression (cf. (42))

${\tilde{B}}_{λ, X_{0}, n}^{(p, q)} : = exp \{a_{n}^{(q)} \cdot X_{0} + \frac{p}{q} \sum_{k = 1}^{n} a_{k}^{(q)} + n \cdot (\frac{p}{q} β_{λ} - α_{λ})\}$

as well as

${\tilde{B}}_{λ, X_{0}, n}^{(p, 0)} : = exp \{- β_{λ} \cdot X_{0} + (p \cdot e^{- β_{λ}} - α_{λ}) \cdot n\} .$

Then, for all $(β_{A}, β_{H}, α_{A}, α_{H}) \in P_{SP} \ P_{SP, 1}$ , all $λ \in R \ {0, 1}$ , all coefficients $p_{λ}^{L}, p_{λ}^{U}, q_{λ}^{L}, q_{λ}^{U} \in R$ which satisfy (35) for all $x \in N_{0}$ , all initial population sizes $X_{0} \in N$ and all observation horizons $n \in N$ one gets the following recursive bounds for the power divergences: for $λ \in] 0, 1 [$ there holds

$I_{λ} (P_{A, n} ∥ P_{H, n}) \{\begin{matrix} < & \frac{1}{λ (1 - λ)} \cdot (1 - B_{λ, X_{0}, n}^{L}) = \frac{1}{λ (1 - λ)} \cdot (1 - {\tilde{B}}_{λ, X_{0}, n}^{(p_{λ}^{L}, q_{λ}^{L})}) = : B_{λ, X_{0}, n}^{I, U}, \\ \geq & \frac{1}{λ (1 - λ)} \cdot (1 - B_{λ, X_{0}, n}^{U}) = \frac{1}{λ (1 - λ)} \cdot (1 - min \{{\tilde{B}}_{λ, X_{0}, n}^{(p_{λ}^{U}, q_{λ}^{U})}, 1\}) = : B_{λ, X_{0}, n}^{I, L}, \end{matrix}$

whereas for $λ \in R \ [0, 1]$ there holds

$I_{λ} (P_{A, n} ∥ P_{H, n}) \{\begin{matrix} < & \frac{1}{λ (λ - 1)} \cdot (B_{λ, X_{0}, n}^{U} - 1) = \frac{1}{λ (λ - 1)} \cdot ({\tilde{B}}_{λ, X_{0}, n}^{(p_{λ}^{U}, q_{λ}^{U})} - 1) = : B_{λ, X_{0}, n}^{I, U}, \\ \geq & \frac{1}{λ (λ - 1)} \cdot (B_{λ, X_{0}, n}^{L} - 1) = \frac{1}{λ (λ - 1)} \cdot (max \{{\tilde{B}}_{λ, X_{0}, n}^{(p_{λ}^{L}, q_{λ}^{L})}, 1\} - 1) = : B_{λ, X_{0}, n}^{I, L} . \end{matrix}$

In order to deduce the subsequent detailed recursive analyses of power divergences, we also employ the obvious relations

\begin{matrix} lim_{n \to \infty} \frac{1}{n} log (\frac{1}{λ (1 - λ)} - I_{λ} (P_{A, n} ∥ P_{H, n})) = lim_{n \to \infty} \frac{1}{n} [- log (λ (1 - λ)) + log (H_{λ} (P_{A, n} ∥ P_{H, n}))] \\ = & lim_{n \to \infty} \frac{1}{n} log (H_{λ} (P_{A, n} ∥ P_{H, n})), for λ \in] 0, 1 [, \end{matrix}

(66)

as well as

\begin{matrix} lim_{n \to \infty} \frac{1}{n} log (I_{λ} (P_{A, n} ∥ P_{H, n})) = lim_{n \to \infty} \frac{1}{n} [- log (λ (λ - 1)) + log (H_{λ} (P_{A, n} ∥ P_{H, n}) - 1)] \\ = & lim_{n \to \infty} \frac{1}{n} [log (1 - \frac{1}{H_{λ} (P_{A, n} | | P_{H, n})}) + log (H_{λ} (P_{A, n} ∥ P_{H, n}))] = lim_{n \to \infty} \frac{1}{n} log (H_{λ} (P_{A, n} ∥ P_{H, n})), \end{matrix}

(67)

for

λ \in R \ [0, 1]

(provided that

{lim inf}_{n \to \infty} H_{λ} (P_{A, n} ∥ P_{H, n}) > 1

).

4.2. Detailed Analyses of the Exact Recursive Values of $I_{λ} (\cdot ∥ \cdot)$ , i.e., for the Cases $(β_{A}, β_{H}, α_{A}, α_{H}, λ) \in (P_{NI} \cup P_{SP, 1}) \times (R \ {0, 1})$

Corollary 2.

For all

(β_{A}, β_{H}, α_{A}, α_{H}, λ) \in P_{NI} \times] 0, 1 [

and all initial population sizes

X_{0} \in N

there holds with

q_{λ}^{E} : = β_{A}^{λ} β_{H}^{1 - λ}

\begin{matrix} (a) & I_{λ} (P_{A, 1} ∥ P_{H, 1}) = \frac{1}{λ (1 - λ)} \cdot (1 - exp \{(β_{A}^{λ} β_{H}^{1 - λ} - β_{λ}) \cdot X_{0}\}) > 0, \\ (b) & the sequence {(I_{λ} (P_{A, n} ∥ P_{H, n}))}_{n \in N} given by \\ I_{λ} (P_{A, n} ∥ P_{H, n}) = \frac{1}{λ (1 - λ)} \cdot (1 - exp \{a_{n}^{(q_{λ}^{E})} \cdot X_{0}\}) = : V_{λ, X_{0}, n}^{I} \\ is strictly increasing, \\ (c) & lim_{n \to \infty} I_{λ} (P_{A, n} ∥ P_{H, n}) = \frac{1}{λ (1 - λ)} \cdot (1 - exp \{x_{0}^{(q_{λ}^{E})} \cdot X_{0}\}) \in] 0, \frac{1}{λ (1 - λ)} [, \\ (d) & lim_{n \to \infty} \frac{1}{n} log (\frac{1}{λ (1 - λ)} - I_{λ} (P_{A, n} ∥ P_{H, n})) = lim_{n \to \infty} \frac{1}{n} log H_{λ} (P_{A, n} ∥ P_{H, n}) = 0, \\ (e) & the map X_{0} \mapsto V_{λ, X_{0}, n}^{I} is strictly increasing . \end{matrix}

Corollary 3.

For all

(β_{A}, β_{H}, α_{A}, α_{H}, λ) \in P_{NI} \times (R \ [0, 1])

and all initial population sizes

X_{0} \in N

there holds with

q_{λ}^{E} : = β_{A}^{λ} β_{H}^{1 - λ}

\begin{matrix} (a) & I_{λ} (P_{A, 1} ∥ P_{H, 1}) = \frac{1}{λ (λ - 1)} \cdot (exp \{(β_{A}^{λ} β_{H}^{1 - λ} - β_{λ}) \cdot X_{0}\} - 1) > 0, \\ (b) & the sequence {(I_{λ} (P_{A, n} ∥ P_{H, n}))}_{n \in N} given by \\ I_{λ} (P_{A, n} ∥ P_{H, n}) = \frac{1}{λ (λ - 1)} \cdot (exp \{a_{n}^{(q_{λ}^{E})} \cdot X_{0}\} - 1) = : V_{λ, X_{0}, n}^{I} \\ is strictly increasing, \\ (c) & lim_{n \to \infty} I_{λ} (P_{A, n} ∥ P_{H, n}) = \{\begin{matrix} \frac{1}{λ (λ - 1)} \cdot (exp \{x_{0}^{(q_{λ}^{E})} \cdot X_{0}\} - 1) > 0, & if λ \in [λ_{-}, λ_{+}] \ [0, 1], \\ \infty, & if λ \in] - \infty, λ_{-} [\cup] λ_{+}, \infty [, \end{matrix} \\ (d) & lim_{n \to \infty} \frac{1}{n} log I_{λ} (P_{A, n} ∥ P_{H, n}) = \{\begin{matrix} 0, & if λ \in [λ_{-}, λ_{+}] \ [0, 1], \\ \infty, & if λ \in] - \infty, λ_{-} [\cup] λ_{+}, \infty [, \end{matrix} \\ (e) & the map X_{0} \mapsto V_{λ, X_{0}, n}^{I} is strictly increasing . \end{matrix}

Corollary 4.

For all

(β_{A}, β_{H}, α_{A}, α_{H}, λ) \in P_{SP, 1} \times] 0, 1 [

and all initial population sizes

X_{0} \in N

there holds with

q_{λ}^{E} : = β_{A}^{λ} β_{H}^{1 - λ}

\begin{matrix} (a) & I_{λ} (P_{A, 1} ∥ P_{H, 1}) = \frac{1}{λ (1 - λ)} \cdot (1 - exp \{(β_{A}^{λ} β_{H}^{1 - λ} - β_{λ}) \cdot (X_{0} + \frac{α_{A}}{β_{A}})\}) > 0, \\ (b) & the sequence {(I_{λ} (P_{A, n} ∥ P_{H, n}))}_{n \in N} given by \\ I_{λ} (P_{A, n} ∥ P_{H, n}) = \frac{1}{λ (1 - λ)} \cdot (1 - exp \{a_{n}^{(q_{λ}^{E})} \cdot X_{0} + \frac{α_{A}}{β_{A}} \sum_{k = 1}^{n} a_{k}^{(q_{λ}^{E})}\}) = : V_{λ, X_{0}, n}^{I} \\ is strictly increasing, \\ (c) & lim_{n \to \infty} I_{λ} (P_{A, n} ∥ P_{H, n}) = \frac{1}{λ (1 - λ)}, \\ (d) & lim_{n \to \infty} \frac{1}{n} log (\frac{1}{λ (1 - λ)} - I_{λ} (P_{A, n} ∥ P_{H, n})) = \frac{α_{A}}{β_{A}} \cdot x_{0}^{(q_{λ}^{E})} < 0, \\ (e) & the map X_{0} \mapsto V_{λ, X_{0}, n}^{I} is strictly increasing . \end{matrix}

Corollary 5.

For all

(β_{A}, β_{H}, α_{A}, α_{H}, λ) \in P_{SP, 1} \times (R \ [0, 1])

and all initial population sizes

X_{0} \in N

there holds with

q_{λ}^{E} : = β_{A}^{λ} β_{H}^{1 - λ}

\begin{matrix} (a) & I_{λ} (P_{A, 1} ∥ P_{H, 1}) = \frac{1}{λ (λ - 1)} \cdot (exp \{(β_{A}^{λ} β_{H}^{1 - λ} - β_{λ}) \cdot (X_{0} + \frac{α_{A}}{β_{A}})\} - 1) > 0, \\ (b) & the sequence {(I_{λ} (P_{A, n} ∥ P_{H, n}))}_{n \in N} given by \\ I_{λ} (P_{A, n} ∥ P_{H, n}) = \frac{1}{λ (λ - 1)} \cdot (exp \{a_{n}^{(q_{λ}^{E})} \cdot X_{0} + \frac{α_{A}}{β_{A}} \sum_{k = 1}^{n} a_{k}^{(q_{λ}^{E})}\} - 1) = : V_{λ, X_{0}, n}^{I} \\ is strictly increasing, \\ (c) & lim_{n \to \infty} I_{λ} (P_{A, n} ∥ P_{H, n}) = \infty, \\ (d) & lim_{n \to \infty} \frac{1}{n} log I_{λ} (P_{A, n} ∥ P_{H, n}) = \{\begin{matrix} \frac{α_{A}}{β_{A}} \cdot x_{0}^{(q_{λ}^{E})} > 0, & if λ \in [λ_{-}, λ_{+}] \ [0, 1], \\ \infty, & if λ \in] - \infty, λ_{-} [\cup] λ_{+}, \infty [, \end{matrix} \\ (e) & the map X_{0} \mapsto V_{λ, X_{0}, n}^{I} is strictly increasing . \end{matrix}

In the assertions (a), (b), (d) of the Corollaries 4 and 5 the fraction

α_{A} / β_{A}

can be equivalently replaced by

α_{H} / β_{H}

.
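The exact values of Corollaries 2 to 5 can be evaluated with a few lines of code. The Python sketch below implements formula (65) under the assumptions (taken from Section 3 and not restated here) that $β_{λ} = λ β_{A} + (1 - λ) β_{H}$ and $a_{0}^{(q)} = 0$, $a_{k}^{(q)} = q \cdot e^{a_{k - 1}^{(q)}} - β_{λ}$. As an independent plausibility check for $n = 1$, it also sums the Hellinger integral of the two Poisson laws of $X_{1}$ given $X_{0}$ by brute force. All function names are hypothetical.

```python
import math

def exact_power_divergence(beta_A, beta_H, alpha_A, alpha_H, lam, X0, n):
    """Formula (65); meaningful only if alpha_A/beta_A == alpha_H/beta_H
    (or if there is no immigration, alpha_A == alpha_H == 0)."""
    beta_lam = lam * beta_A + (1 - lam) * beta_H   # assumed definition of beta_lambda
    q = beta_A**lam * beta_H**(1 - lam)            # q^E_lambda
    a, a_sum = 0.0, 0.0
    for _ in range(n):
        a = q * math.exp(a) - beta_lam             # assumed recursion for a_k
        a_sum += a
    H = math.exp(a * X0 + (alpha_A / beta_A) * a_sum)
    return (H - 1) / (lam * (lam - 1))

def poisson_hellinger(mu_A, mu_H, lam, terms=200):
    """Brute-force sum over x of pA(x)^lam * pH(x)^(1-lam) for two Poisson laws."""
    total = 0.0
    for x in range(terms):
        log_pA = -mu_A + x * math.log(mu_A) - math.lgamma(x + 1)
        log_pH = -mu_H + x * math.log(mu_H) - math.lgamma(x + 1)
        total += math.exp(lam * log_pA + (1 - lam) * log_pH)
    return total

# Illustrative setup with alpha_A/beta_A = alpha_H/beta_H = 2 and lam = 0.3
bA, bH, aA, aH, lam, X0 = 0.5, 1.0, 1.0, 2.0, 0.3, 4
recursive = exact_power_divergence(bA, bH, aA, aH, lam, X0, 1)
direct = (1 - poisson_hellinger(aA + bA * X0, aH + bH * X0, lam)) / (lam * (1 - lam))
```

Here `recursive` and `direct` should agree up to rounding, because for $n = 1$ the observation $X_{1}$ is Poisson distributed with mean $α_{•} + β_{•} X_{0}$ under both hypotheses.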

Let us now derive the corresponding detailed results for the bounds of the power divergences for the parameter cases

P_{SP} \ P_{SP, 1}

, where the Hellinger integral, and thus

I_{λ} (P_{A, n} ∥ P_{H, n})

, cannot be determined exactly. The extensive discussion of the Hellinger-integral bounds in Sections 3.4–3.13, as well as in Sections 3.16–3.24, can be carried over directly to obtain power-divergence bounds. In the following, we summarize the resulting key results, referring, for a detailed discussion of the possible choices of

p_{λ}^{A} = p^{A} (β_{A}, β_{H}, α_{A}, α_{H}, λ)

and

q_{λ}^{A} = q^{A} (β_{A}, β_{H}, α_{A}, α_{H}, λ)

(

A \in {L, U}

) to the corresponding above-mentioned subsections.

4.3. Lower Bounds of $I_{λ} (\cdot ∥ \cdot)$ for the Cases $(β_{A}, β_{H}, α_{A}, α_{H}, λ) \in (P_{SP} \ P_{SP, 1}) \times] 0, 1 [$

Corollary 6.

For all

(β_{A}, β_{H}, α_{A}, α_{H}, λ) \in (P_{SP, 2} \cup P_{SP, 3 a} \cup P_{SP, 3 b}) \times] 0, 1 [

there exist parameters

p_{λ}^{U}, q_{λ}^{U}

which satisfy

p_{λ}^{U} \in [α_{A}^{λ} α_{H}^{1 - λ}, α_{λ}]

and

q_{λ}^{U} \in [β_{A}^{λ} β_{H}^{1 - λ}, β_{λ} [

as well as (35) for all

x \in N_{0}

, and for all such pairs

(p_{λ}^{U}, q_{λ}^{U})

and all initial population sizes

X_{0} \in N

there holds

\begin{matrix} (a) & B_{λ, X_{0}, 1}^{I, L} = \frac{1}{λ (1 - λ)} \cdot (1 - exp \{(q_{λ}^{U} - β_{λ}) \cdot X_{0} + p_{λ}^{U} - α_{λ}\}) > 0, \\ (b) & the sequence {(B_{λ, X_{0}, n}^{I, L})}_{n \in N} of lower bounds for I_{λ} (P_{A, n} ∥ P_{H, n}) given by \\ B_{λ, X_{0}, n}^{I, L} = \frac{1}{λ (1 - λ)} \cdot (1 - exp \{a_{n}^{(q_{λ}^{U})} \cdot X_{0} + \sum_{k = 1}^{n} b_{k}^{(p_{λ}^{U}, q_{λ}^{U})}\}) \\ is strictly increasing, \\ (c) & lim_{n \to \infty} B_{λ, X_{0}, n}^{I, L} = lim_{n \to \infty} I_{λ} (P_{A, n} ∥ P_{H, n}) = \frac{1}{λ (1 - λ)}, \\ (d) & lim_{n \to \infty} \frac{1}{n} log (\frac{1}{λ (1 - λ)} - B_{λ, X_{0}, n}^{I, L}) = p_{λ}^{U} \cdot e^{x_{0}^{(q_{λ}^{U})}} - α_{λ} < 0, \\ (e) & the map X_{0} \mapsto B_{λ, X_{0}, n}^{I, L} is strictly increasing . \end{matrix}

Remark 4.

(a): Notice that in the case $(β_{A}, β_{H}, α_{A}, α_{H}, λ) \in P_{SP, 2} \times] 0, 1 [$ –where $α_{A}^{λ} α_{H}^{1 - λ} = α_{λ} = α_{A} = α_{H} = α$ –we get the special choice $p_{λ}^{U} = α$ and $q_{λ}^{U} = {(α + β_{A})}^{λ} {(α + β_{H})}^{1 - λ} - α$ (cf. Section 3.7). For the constellations $(β_{A}, β_{H}, α_{A}, α_{H}, λ) \in (P_{SP, 3 a} \cup P_{SP, 3 b}) \times] 0, 1 [$ there exist parameters $p_{λ}^{U} \in [α_{A}^{λ} α_{H}^{1 - λ}, α_{λ} [$ , $q_{λ}^{U} \in [β_{A}^{λ} β_{H}^{1 - λ}, β_{λ} [$ which satisfy (35) for all $x \in N_{0}$ .
(b): For the parameter setups $(β_{A}, β_{H}, α_{A}, α_{H}, λ) \in (P_{SP, 2} \cup P_{SP, 3 a} \cup P_{SP, 3 b}) \times] 0, 1 [$ there might exist parameter pairs $(p_{λ}^{U}, q_{λ}^{U})$ satisfying (35) and either $p_{λ}^{U} = α_{λ}$ or $q_{λ}^{U} = β_{λ}$ , for which all assertions of Corollary 6 still hold true.
(c): Following the discussion in Section 3.10 for all $(β_{A}, β_{H}, α_{A}, α_{H}, λ) \in P_{SP, 3 c} \times] 0, 1 [$ at least part (c) still holds true.

Corollary 7.

For all

(β_{A}, β_{H}, α_{A}, α_{H}, λ) \in P_{SP, 4 a} \times] 0, 1 [

there exist parameters

p_{λ}^{U} < α_{λ}

,

1 > q_{λ}^{U} > β_{λ} = β

such that (35) is satisfied for all

x \in [0, \infty [

and such that for all initial population sizes

X_{0} \in N

at least the parts (c) and (d) of Corollary 6 hold true.

As in Section 3.12, for the parameter setup

(β_{A}, β_{H}, α_{A}, α_{H}, λ) \in P_{SP, 4 b} \times] 0, 1 [

we cannot derive a lower bound for the power divergences which improves the generally valid lower bound

I_{λ} (P_{A, n} ∥ P_{H, n}) \geq 0

(cf. (10)) by employing our proposed (

p_{λ}^{U}, q_{λ}^{U}

)-method.

4.4. Upper Bounds of $I_{λ} (\cdot ∥ \cdot)$ for the Cases $(β_{A}, β_{H}, α_{A}, α_{H}, λ) \in (P_{SP} \ P_{SP, 1}) \times] 0, 1 [$

Since in this setup the upper bounds of the power divergences can be derived from the lower bounds of the Hellinger integrals, we here appropriately adapt the results of Proposition 6.

Corollary 8.

For all

(β_{A}, β_{H}, α_{A}, α_{H}, λ) \in (P_{SP} \ P_{SP, 1}) \times] 0, 1 [

and all initial population sizes

X_{0} \in N

there holds with

p_{λ}^{L} : = α_{A}^{λ} α_{H}^{1 - λ}

and

q_{λ}^{L} : = β_{A}^{λ} β_{H}^{1 - λ}

\begin{matrix} (a) & B_{λ, X_{0}, 1}^{I, U} = \frac{1}{λ (1 - λ)} \cdot (1 - exp \{(β_{A}^{λ} β_{H}^{1 - λ} - β_{λ}) \cdot X_{0} + α_{A}^{λ} α_{H}^{1 - λ} - α_{λ}\}) > 0, \\ (b) & the sequence of upper bounds {(B_{λ, X_{0}, n}^{I, U})}_{n \in N} for I_{λ} (P_{A, n} ∥ P_{H, n}) given by \\ B_{λ, X_{0}, n}^{I, U} = \frac{1}{λ (1 - λ)} \cdot (1 - exp \{a_{n}^{(q_{λ}^{L})} \cdot X_{0} + \frac{p_{λ}^{L}}{q_{λ}^{L}} \sum_{k = 1}^{n} a_{k}^{(q_{λ}^{L})} + n \cdot (\frac{p_{λ}^{L}}{q_{λ}^{L}} \cdot β_{λ} - α_{λ})\}) \\ is strictly increasing, \\ (c) & lim_{n \to \infty} B_{λ, X_{0}, n}^{I, U} = \frac{1}{λ (1 - λ)}, \\ (d) & lim_{n \to \infty} \frac{1}{n} log (\frac{1}{λ (1 - λ)} - B_{λ, X_{0}, n}^{I, U}) = \frac{p_{λ}^{L}}{q_{λ}^{L}} \cdot (x_{0}^{(q_{λ}^{L})} + β_{λ}) - α_{λ} = p_{λ}^{L} \cdot e^{x_{0}^{(q_{λ}^{L})}} - α_{λ} < 0, \\ (e) & the map X_{0} \mapsto B_{λ, X_{0}, n}^{I, U} is strictly increasing . \end{matrix}
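Part (d) of Corollary 8 can be checked numerically. The Python sketch below compares the empirical rate $\frac{1}{n} log (\frac{1}{λ (1 - λ)} - B_{λ, X_{0}, n}^{I, U})$ for a large $n$ with the two equivalent limit expressions. It assumes the Section 3 definitions $α_{λ} = λ α_{A} + (1 - λ) α_{H}$, $β_{λ} = λ β_{A} + (1 - λ) β_{H}$, $a_{k}^{(q)} = q \cdot e^{a_{k - 1}^{(q)}} - β_{λ}$ with $a_{0}^{(q)} = 0$, and $b_{k}^{(p, q)} = p \cdot e^{a_{k - 1}^{(q)}} - α_{λ}$; the fixed point $x_{0}^{(q)}$ is approximated by simply continuing the iteration. The function name and parameter values are hypothetical.

```python
import math

def decay_rate_check(beta_A, beta_H, alpha_A, alpha_H, lam, X0, n):
    """Return (empirical rate, (p/q)*(x0 + beta_lam) - alpha_lam, p*exp(x0) - alpha_lam)."""
    alpha_lam = lam * alpha_A + (1 - lam) * alpha_H   # assumed definition
    beta_lam = lam * beta_A + (1 - lam) * beta_H      # assumed definition
    p = alpha_A**lam * alpha_H**(1 - lam)             # p^L_lambda
    q = beta_A**lam * beta_H**(1 - lam)               # q^L_lambda
    a, log_B = 0.0, 0.0
    for _ in range(n):
        log_B += p * math.exp(a) - alpha_lam          # adds b_k
        a = q * math.exp(a) - beta_lam                # a_k
    log_B += a * X0                                   # log of the Hellinger lower bound
    # 1/(lam(1-lam)) - B^{I,U} equals the Hellinger lower bound divided by lam(1-lam)
    empirical = (log_B - math.log(lam * (1 - lam))) / n
    x0 = a
    for _ in range(1000):                             # refine the fixed point numerically
        x0 = q * math.exp(x0) - beta_lam
    return empirical, (p / q) * (x0 + beta_lam) - alpha_lam, p * math.exp(x0) - alpha_lam

empirical, limit_1, limit_2 = decay_rate_check(0.5, 0.8, 2.0, 1.0, 0.5, 5, 5000)
```

The two limit expressions coincide because $x_{0}^{(q)}$ solves $q \cdot e^{x} - β_{λ} = x$, and the empirical rate approaches them at speed of order $1 / n$.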

4.5. Lower Bounds of $I_{λ} (\cdot ∥ \cdot)$ for the Cases $(β_{A}, β_{H}, α_{A}, α_{H}, λ) \in (P_{SP} \ P_{SP, 1}) \times (R \ [0, 1])$

In order to derive detailed results on lower bounds of the power divergences in the case

λ \in R \ [0, 1]

, we have to subsume and adapt the lower-bound investigations for the Hellinger integrals from Sections 3.16–3.23. Recall the

λ

-sets

I_{SP, 2}, I_{SP, 3 a}, I_{SP, 3 b}

(cf. (58), (60), (62)). For the constellations

P_{SP, 2} \times I_{SP, 2}

we employ the special choice

p_{λ}^{L} = α_{A}^{λ} α_{H}^{1 - λ} = α_{λ} = α_{A} = α_{H} = α

together with

q_{λ}^{L} = {(α + β_{A})}^{λ} {(α + β_{H})}^{1 - λ} - α > max {0, β_{λ}}

(cf. (58)) which satisfy (35) for all

x \in N_{0}

and (56), whereas for the constellations

(P_{SP, 3 a} \times I_{SP, 3 a})

\cup (P_{SP, 3 b} \times I_{SP, 3 b})

we have proved the existence of parameters

p_{λ}^{L}, q_{λ}^{L}

satisfying both (35) for all

x \in N_{0}

and (56) with two strict inequalities. Subsuming this, we obtain

Corollary 9.

For all

(β_{A}, β_{H}, α_{A}, α_{H}, λ) \in (P_{SP, 2} \times I_{SP, 2})

\cup (P_{SP, 3 a} \times I_{SP, 3 a})

\cup (P_{SP, 3 b} \times I_{SP, 3 b})

there exist parameters

p_{λ}^{L}, q_{λ}^{L}

which satisfy

max {0, α_{λ}} \leq p_{λ}^{L} \leq α_{A}^{λ} α_{H}^{1 - λ}, max {0, β_{λ}} < q_{λ}^{L} \leq β_{A}^{λ} β_{H}^{1 - λ}

as well as (35) for all

x \in N_{0}

, and for all such pairs

(p_{λ}^{L}, q_{λ}^{L})

and all initial population sizes

X_{0} \in N

one gets

\begin{matrix} (a) & B_{λ, X_{0}, 1}^{I, L} = \frac{1}{λ (λ - 1)} \cdot (exp \{(q_{λ}^{L} - β_{λ}) \cdot X_{0} + p_{λ}^{L} - α_{λ}\} - 1) > 0, \\ (b) & the sequence {(B_{λ, X_{0}, n}^{I, L})}_{n \in N} of lower bounds for I_{λ} (P_{A, n} ∥ P_{H, n}) given by \\ B_{λ, X_{0}, n}^{I, L} = \frac{1}{λ (λ - 1)} \cdot (exp \{a_{n}^{(q_{λ}^{L})} \cdot X_{0} + \sum_{k = 1}^{n} b_{k}^{(p_{λ}^{L}, q_{λ}^{L})}\} - 1) \\ is strictly increasing, \\ (c) & lim_{n \to \infty} B_{λ, X_{0}, n}^{I, L} = lim_{n \to \infty} I_{λ} (P_{A, n} ∥ P_{H, n}) = \infty, \\ (d) & lim_{n \to \infty} \frac{1}{n} log B_{λ, X_{0}, n}^{I, L} = \{\begin{matrix} p_{λ}^{L} \cdot exp \{x_{0}^{(q_{λ}^{L})}\} - α_{λ} > 0, & if q_{λ}^{L} \leq min \{1; e^{β_{λ} - 1}\}, \\ \infty, & if q_{λ}^{L} > min \{1; e^{β_{λ} - 1}\}, \end{matrix} \\ (e) & the map X_{0} \mapsto B_{λ, X_{0}, n}^{I, L} is strictly increasing . \end{matrix}

Analogously to the discussions in Sections 3.17–3.20, for the parameter setups

(P_{SP, 2} \times R \ (I_{SP, 2} \cup [0, 1]))

\cup (P_{SP, 3 a} \times R \ (I_{SP, 3 a} \cup [0, 1])) \cup (P_{SP, 3 b} \times R \ (I_{SP, 3 b} \cup [0, 1])) \cup (P_{SP, 3 c} \times R \ [0, 1])

and for all initial population sizes

X_{0} \in N

one can still show

0 < I_{λ} (P_{A, n} ∥ P_{H, n}), and lim_{n \to \infty} I_{λ} (P_{A, n} ∥ P_{H, n}) = \infty .

For the penultimate case we obtain

Corollary 10.

For all

(β_{A}, β_{H}, α_{A}, α_{H}, λ) \in P_{SP, 4 a} \times (R \ [0, 1])

there exist parameters

p_{λ}^{L} > α_{λ}

(where not necessarily

p_{λ}^{L} \geq 0

) and

0 < q_{λ}^{L} < β_{λ} = β_{•} < min {1, e^{β_{•} - 1}} = e^{β_{•} - 1}

such that (35) is satisfied for all

x \in [0, \infty [

and such that for all initial population sizes

X_{0} \in N

at least the parts (c) and (d) of Corollary 9 hold true.

Notice that for the last case

(β_{A}, β_{H}, α_{A}, α_{H}, λ) \in P_{SP, 4 b} \times R \ [0, 1]

(where

β_{A} = β_{H} \geq 1

) we cannot derive lower bounds of the power divergences which improve the generally valid lower bound

I_{λ} (P_{A, n} ∥ P_{H, n}) \geq 0

(cf. (11)) by employing our proposed (

p_{λ}^{L}, q_{λ}^{L}

)-method.

4.6. Upper Bounds of $I_{λ} (\cdot ∥ \cdot)$ for the Cases $(β_{A}, β_{H}, α_{A}, α_{H}, λ) \in (P_{SP} \ P_{SP, 1}) \times (R \ [0, 1])$

For these constellations we adapt Proposition 14, which after the corresponding transformation becomes

Corollary 11.

For all

(β_{A}, β_{H}, α_{A}, α_{H}, λ) \in (P_{SP} \ P_{SP, 1}) \times (R \ [0, 1])

and all initial population sizes

X_{0} \in N

there holds with

p_{λ}^{U} : = α_{A}^{λ} α_{H}^{1 - λ}

and

q_{λ}^{U} : = β_{A}^{λ} β_{H}^{1 - λ}

\begin{matrix} (a) & B_{λ, X_{0}, 1}^{I, U} = \frac{1}{λ (λ - 1)} \cdot (exp \{(β_{A}^{λ} β_{H}^{1 - λ} - β_{λ}) \cdot X_{0} + α_{A}^{λ} α_{H}^{1 - λ} - α_{λ}\} - 1) > 0, \\ (b) & the sequence {(B_{λ, X_{0}, n}^{I, U})}_{n \in N} of upper bounds for I_{λ} (P_{A, n} ∥ P_{H, n}) given by \\ B_{λ, X_{0}, n}^{I, U} = \frac{1}{λ (λ - 1)} \cdot (exp \{a_{n}^{(q_{λ}^{U})} \cdot X_{0} + \sum_{k = 1}^{n} b_{k}^{(p_{λ}^{U}, q_{λ}^{U})}\} - 1) \\ is strictly increasing, \\ (c) & lim_{n \to \infty} B_{λ, X_{0}, n}^{I, U} = \infty, \\ (d) & lim_{n \to \infty} \frac{1}{n} log B_{λ, X_{0}, n}^{I, U} = \{\begin{matrix} p_{λ}^{U} \cdot exp \{x_{0}^{(q_{λ}^{U})}\} - α_{λ} > 0, & if λ \in [λ_{-}, λ_{+}] \ [0, 1], \\ \infty, & if λ \in] - \infty, λ_{-} [\cup] λ_{+}, \infty [, \end{matrix} \\ (e) & the map X_{0} \mapsto B_{λ, X_{0}, n}^{I, U} is strictly increasing . \end{matrix}

4.7. Applications to Bayesian Decision Making

As explained in Section 2.5, the power divergences fulfill

I_{λ} (P_{A, n} ∥ P_{H, n}) = \int_{0}^{1} Δ {BR}_{\tilde{LO}} (p_{A}^{prior}) \cdot {(1 - p_{A}^{prior})}^{λ - 2} \cdot {(p_{A}^{prior})}^{- 1 - λ} d p_{A}^{prior}, λ \in R, (cf . (21)),

and

I_{λ} (P_{A, n} ∥ P_{H, n}) = lim_{χ \to p_{A}^{prior}} Δ {BR}_{{LO}_{λ, χ}} (p_{A}^{prior}), λ \in] 0, 1 [, (cf . (22)),

and thus can be interpreted as (i) weighted-average decision risk reduction (weighted-average statistical information measure) about the degree of evidence

d e g

concerning the parameter

θ

that can be attained by observing the GWI-path

X_{n}

until stage n, and as (ii) limit decision risk reduction (limit statistical information measure). Hence, by combining (21) and (22) with the investigations in the previous Sections 4.1–4.6, we obtain exact recursive values respectively recursive bounds of the above-mentioned decision risk reductions. For the sake of brevity, we omit the details here.

5. Kullback-Leibler Information Divergence (Relative Entropy)

5.1. Exact Values Respectively Upper Bounds of $I (\cdot | | \cdot)$

From (2), (3) and (6) in Section 2.4, one can immediately see that the Kullback-Leibler information divergence (relative entropy) between two competing Galton-Watson processes without/with immigration can be obtained by the limit

I (P_{A, n} ∥ P_{H, n}) = lim_{λ ↗ 1} I_{λ} (P_{A, n} ∥ P_{H, n}),

(68)

and the reverse Kullback-Leibler information divergence (reverse relative entropy) by

I (P_{H, n} ∥ P_{A, n}) = {lim}_{λ ↘ 0} I_{λ} (P_{A, n} ∥ P_{H, n})

. Hence, in the following we concentrate only on (68); the reverse case works analogously. Accordingly, we can use (68) in appropriate combination with the

λ \in] 0, 1 [

-parts of the previous Section 4 (respectively, the corresponding parts of Section 3) in order to obtain detailed analyses for

I (P_{A, n} ∥ P_{H, n})

. Let us start with the following assertions on exact values respectively upper bounds, which will be proved in Appendix A.2:

Theorem 3.

(a): For all $(β_{A}, β_{H}, α_{A}, α_{H}) \in (P_{NI} \cup P_{SP, 1})$ , all initial population sizes $X_{0} \in N$ and all observation horizons $n \in N$ the Kullback-Leibler information divergence (relative entropy) is given by

$I (P_{A, n} ∥ P_{H, n}) = I_{X_{0}, n} : = \{\begin{matrix} \frac{β_{A} \cdot (log (\frac{β_{A}}{β_{H}}) - 1) + β_{H}}{1 - β_{A}} \cdot [X_{0} - \frac{α_{A}}{1 - β_{A}}] \cdot (1 - {(β_{A})}^{n}) \\ + \frac{α_{A} \cdot [β_{A} \cdot (log (\frac{β_{A}}{β_{H}}) - 1) + β_{H}]}{β_{A} (1 - β_{A})} \cdot n, & if β_{A} \neq 1, \\ [β_{H} - log β_{H} - 1] \cdot [\frac{α_{A}}{2} \cdot n^{2} + (X_{0} + \frac{α_{A}}{2}) \cdot n], & if β_{A} = 1 . \end{matrix}$

(69)
(b): For all $(β_{A}, β_{H}, α_{A}, α_{H}) \in P_{SP} \ P_{SP, 1}$ , all initial population sizes $X_{0} \in N$ and all observation horizons $n \in N$ there holds $I (P_{A, n} ∥ P_{H, n}) \leq E_{X_{0}, n}^{U}$ , where

$E_{X_{0}, n}^{U} : = \{\begin{matrix} \frac{β_{A} \cdot (log (\frac{β_{A}}{β_{H}}) - 1) + β_{H}}{1 - β_{A}} \cdot [X_{0} - \frac{α_{A}}{1 - β_{A}}] \cdot (1 - {(β_{A})}^{n}) \\ + [\frac{α_{A} \cdot [β_{A} \cdot (log (\frac{β_{A}}{β_{H}}) - 1) + β_{H}]}{β_{A} (1 - β_{A})} + α_{A} [log (\frac{α_{A} β_{H}}{α_{H} β_{A}}) - \frac{β_{H}}{β_{A}}] + α_{H}] \cdot n, & if β_{A} \neq 1, \\ [β_{H} - log β_{H} - 1] \cdot [\frac{α_{A}}{2} \cdot n^{2} + (X_{0} + \frac{α_{A}}{2}) \cdot n] \\ + [α_{A} [log (\frac{α_{A} β_{H}}{α_{H}}) - β_{H}] + α_{H}] \cdot n, & if β_{A} = 1 . \end{matrix}$

(70)
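The closed form (69) can be cross-checked numerically. The Python sketch below evaluates (69) and compares it with an independent computation that is not the proof route of Appendix A.2: by the chain rule for relative entropy of Markov chains, $I (P_{A, n} ∥ P_{H, n})$ is the sum of the expected one-step Poisson divergences under (A), and when $α_{A} / β_{A} = α_{H} / β_{H}$ the one-step divergence is affine in the current state, so only the means $E_{A} [X_{k}]$ are needed. Function names and parameter values are hypothetical.

```python
import math

def kl_closed_form(beta_A, beta_H, alpha_A, X0, n):
    """Closed form (69); alpha_H enters only via alpha_A/beta_A == alpha_H/beta_H."""
    if beta_A != 1:
        K = beta_A * (math.log(beta_A / beta_H) - 1) + beta_H
        return (K / (1 - beta_A) * (X0 - alpha_A / (1 - beta_A)) * (1 - beta_A**n)
                + alpha_A * K / (beta_A * (1 - beta_A)) * n)
    K = beta_H - math.log(beta_H) - 1
    return K * (alpha_A / 2 * n**2 + (X0 + alpha_A / 2) * n)

def kl_chain_rule(beta_A, beta_H, alpha_A, alpha_H, X0, n):
    """Sum of expected one-step Poisson divergences under (A).
    Valid as an exact check only if alpha_A/beta_A == alpha_H/beta_H."""
    total, mean = 0.0, float(X0)
    for _ in range(n):
        mu_A = alpha_A + beta_A * mean   # offspring-plus-immigration mean under (A)
        mu_H = alpha_H + beta_H * mean   # the same state evaluated under (H)
        total += mu_A * math.log(mu_A / mu_H) - mu_A + mu_H
        mean = alpha_A + beta_A * mean   # E_A[X_k] from E_A[X_{k-1}]
    return total

subcritical = (kl_closed_form(0.5, 1.0, 1.0, 4, 7), kl_chain_rule(0.5, 1.0, 1.0, 2.0, 4, 7))
critical = (kl_closed_form(1.0, 0.5, 2.0, 3, 6), kl_chain_rule(1.0, 0.5, 2.0, 1.0, 3, 6))
```

Both pairs should agree up to rounding; the second one exercises the quadratic-in-$n$ branch $β_{A} = 1$.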

Remark 5.

(i) Notice that the exact values respectively upper bounds are in closed form (rather than in recursive form).

(ii) The n-behaviour of (the bounds of) the Kullback-Leibler information divergence/relative entropy

I (P_{A, n} ∥ P_{H, n})

in Theorem 3 is influenced by the following facts:

(a): $β_{A} \cdot (log (\frac{β_{A}}{β_{H}}) - 1) + β_{H} \geq 0$ with equality iff $β_{A} = β_{H}$ .
(b): In the case $β_{A} \neq 1$ of (70), there holds $\frac{α_{A} \cdot [β_{A} \cdot (log (\frac{β_{A}}{β_{H}}) - 1) + β_{H}]}{β_{A} (1 - β_{A})} + α_{A} [log (\frac{α_{A} β_{H}}{α_{H} β_{A}}) - \frac{β_{H}}{β_{A}}] + α_{H} \geq 0$ , with equality iff $α_{A} = α_{H}$ and $β_{A} = β_{H}$ .

5.2. Lower Bounds of $I (\cdot | | \cdot)$ for the Cases $(β_{A}, β_{H}, α_{A}, α_{H}) \in (P_{SP} \ P_{SP, 1})$

Again by using (68) in appropriate combination with the “

λ \in] 0, 1 [

-parts” of the previous Section 4 (respectively, the corresponding parts of Section 3), we obtain the following (semi-)closed-form lower bounds of

I (P_{A, n} ∥ P_{H, n})

:

Theorem 4.

For all

(β_{A}, β_{H}, α_{A}, α_{H}) \in P_{SP} \ P_{SP, 1}

, all initial population sizes

X_{0} \in N

and all observation horizons

n \in N

I (P_{A, n} ∥ P_{H, n}) \geq E_{X_{0}, n}^{L} : = sup_{k \in N_{0}, y \in [0, \infty [} \{E_{y, X_{0}, n}^{L, t a n}, E_{k, X_{0}, n}^{L, s e c}, E_{X_{0}, n}^{L, h o r}\} \in [0, \infty [,

(71)

where for all

y \in [0, \infty [

we define the (possibly negatively valued) finite bound component

E_{y, X_{0}, n}^{L, \tan} : = \{\begin{matrix} [β_{A} log (\frac{α_{A} + β_{A} y}{α_{H} + β_{H} y}) + β_{H} (1 - \frac{α_{A} + β_{A} y}{α_{H} + β_{H} y})] \cdot \frac{1 - {(β_{A})}^{n}}{1 - β_{A}} \cdot [X_{0} - \frac{α_{A}}{1 - β_{A}}] \\ + [\frac{α_{A}}{β_{A} (1 - β_{A})} [β_{A} log (\frac{α_{A} + β_{A} y}{α_{H} + β_{H} y}) + β_{H} (1 - \frac{α_{A} + β_{A} y}{α_{H} + β_{H} y})] \\ + (α_{H} - α_{A} \frac{β_{H}}{β_{A}}) (1 - \frac{α_{A} + β_{A} y}{α_{H} + β_{H} y})] \cdot n, & if β_{A} \neq 1, \\ [log (\frac{α_{A} + y}{α_{H} + β_{H} y}) + β_{H} (1 - \frac{α_{A} + y}{α_{H} + β_{H} y})] \cdot [\frac{α_{A}}{2} \cdot n^{2} + (X_{0} + \frac{α_{A}}{2}) \cdot n] \\ + (α_{H} - α_{A} β_{H}) (1 - \frac{α_{A} + y}{α_{H} + β_{H} y}) \cdot n, & if β_{A} = 1, \end{matrix}

(72)

and for all

k \in N_{0}

the (possibly negatively valued) finite bound component

E_{k, X_{0}, n}^{L, \sec} : = \{\begin{matrix} [f_{A} (k + 1) log (\frac{f_{A} (k + 1)}{f_{H} (k + 1)}) - f_{A} (k) log (\frac{f_{A} (k)}{f_{H} (k)}) + β_{H} - β_{A}] \cdot \frac{1 - {(β_{A})}^{n}}{1 - β_{A}} \cdot [X_{0} - \frac{α_{A}}{1 - β_{A}}] \\ + [\frac{α_{A}}{β_{A} (1 - β_{A})} (f_{A} (k + 1) log (\frac{f_{A} (k + 1)}{f_{H} (k + 1)}) - f_{A} (k) log (\frac{f_{A} (k)}{f_{H} (k)}) + β_{H} - β_{A}) \\ - (f_{A} (k + 1) log (\frac{f_{A} (k + 1)}{f_{H} (k + 1)}) - f_{A} (k) log (\frac{f_{A} (k)}{f_{H} (k)})) \cdot (k + \frac{α_{A}}{β_{A}}) \\ + f_{A} (k) log (\frac{f_{A} (k)}{f_{H} (k)}) - \frac{α_{A} β_{H}}{β_{A}} + α_{H}] \cdot n, & if β_{A} \neq 1, \\ [f_{A} (k + 1) log (\frac{f_{A} (k + 1)}{f_{H} (k + 1)}) - f_{A} (k) log (\frac{f_{A} (k)}{f_{H} (k)}) + β_{H} - 1] \cdot [\frac{α_{A}}{2} \cdot n^{2} + (X_{0} + \frac{α_{A}}{2}) \cdot n] \\ - [(f_{A} (k + 1) log (\frac{f_{A} (k + 1)}{f_{H} (k + 1)}) - f_{A} (k) log (\frac{f_{A} (k)}{f_{H} (k)})) (k + α_{A}) \\ - f_{A} (k) log (\frac{f_{A} (k)}{f_{H} (k)}) + α_{A} β_{H} - α_{H}] \cdot n, & if β_{A} = 1 . \end{matrix}

(73)

Furthermore, on

P_{SP, 4}

we set

E_{X_{0}, n}^{L, h o r} : = 0

for all

n \in N

whereas on

P_{SP} \ (P_{SP, 1} \cup P_{SP, 4})

we define

E_{X_{0}, n}^{L, h o r} : = [(α_{A} + β_{A} z^{*}) \cdot [log (\frac{α_{A} + β_{A} z^{*}}{α_{H} + β_{H} z^{*}}) - 1] + α_{H} + β_{H} z^{*}] \cdot n, n \in N,

(74)

with

z^{*} : = arg {max}_{x \in N_{0}} \{(α_{A} + β_{A} x) [- log (\frac{α_{A} + β_{A} x}{α_{H} + β_{H} x}) + 1] - (α_{H} + β_{H} x)\}

.

On

P_{SP} \ (P_{SP, 1} \cup P_{SP, 3 c})

one even gets

E_{X_{0}, n}^{L} > 0

for all

X_{0} \in N

and all

n \in N

.

For the subcase

P_{SP, 3 c}

, one obtains for each fixed

n \in N

and each fixed

X_{0} \in N

the strict positivity

E_{X_{0}, n}^{L} > 0

if

(\frac{\partial}{\partial y} E_{y, X_{0}, n}^{L, t a n}) (y^{*}) \neq 0

, where

y^{*} : = \frac{α_{A} - α_{H}}{β_{H} - β_{A}} \in N

and hence

\begin{matrix} (\frac{\partial}{\partial y} E_{y, X_{0}, n}^{L, t a n}) (y^{*}) \\ = \{\begin{matrix} - \frac{{(β_{A} - β_{H})}^{3}}{α_{A} β_{H} - α_{H} β_{A}} \cdot \frac{1 - {(β_{A})}^{n}}{1 - β_{A}} \cdot [X_{0} - \frac{α_{A}}{1 - β_{A}}] - \frac{{(β_{A} - β_{H})}^{2}}{β_{A}} (1 + \frac{α_{A} (β_{A} - β_{H})}{(1 - β_{A}) (α_{A} β_{H} - α_{H} β_{A})}) \cdot n, & if β_{A} \neq 1, \\ - \frac{{(1 - β_{H})}^{3}}{α_{A} β_{H} - α_{H}} \cdot [\frac{α_{A}}{2} \cdot n^{2} + (X_{0} + \frac{α_{A}}{2}) \cdot n] - {(1 - β_{H})}^{2} \cdot n, & if β_{A} = 1 . \end{matrix} \end{matrix}

(75)

A proof of this theorem is given in Appendix A.2.
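The horizontal component (74) of Theorem 4 is the easiest one to evaluate. Since the expression maximized in the definition of $z^{*}$ is the negative of the one-step Poisson relative entropy between the means $α_{A} + β_{A} x$ and $α_{H} + β_{H} x$, the point $z^{*}$ minimizes that relative entropy over $N_{0}$. The Python sketch below searches a finite range; the cap `x_max` is an assumption of this sketch (it is harmless when $β_{A} \neq β_{H}$, because the objective then grows linearly for large $x$), and the function name is hypothetical.

```python
import math

def horizontal_bound(beta_A, beta_H, alpha_A, alpha_H, n, x_max=10_000):
    """Return (z_star, E^{L,hor}_{X0,n}) according to (74), searching x in {0, ..., x_max}."""
    def one_step_kl(x):
        f_A = alpha_A + beta_A * x
        f_H = alpha_H + beta_H * x
        return f_A * (math.log(f_A / f_H) - 1) + f_H
    # maximizing the expression in the definition of z* == minimizing one_step_kl
    z_star = min(range(x_max + 1), key=one_step_kl)
    return z_star, one_step_kl(z_star) * n

# Illustrative setup: the two mean lines cross at x = 10/3, which is not an integer
z_star, e_hor = horizontal_bound(0.5, 0.8, 2.0, 1.0, n=10)
```

In this example the minimizer is the integer next to the crossing point of the two lines, and the resulting bound is small but strictly positive; it grows linearly in the observation horizon $n$.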

Remark 6.

Consider the exemplary parameter setup

(β_{A}, β_{H}, α_{A}, α_{H}) = (\frac{1}{3}, \frac{2}{3}, 2, 1) \in P_{SP, 3 c}

; within our running-example epidemiological context of Section 2.3, this corresponds to a “semi-mild” infectious-disease-transmission situation

(H)

(with subcritical reproduction number

β_{H} = \frac{2}{3}

and importation mean of

α_{H} = 1

), whereas

(A)

describes a “mild” situation (with “low” subcritical

β_{A} = \frac{1}{3}

and

α_{A} = 2

). In the case of

X_{0} = 3

there holds

(\frac{\partial}{\partial y} E_{y, X_{0}, n}^{L, t a n}) (y^{*}) = 0

for all

n \in N

, whereas for

X_{0} \neq 3

one obtains

(\frac{\partial}{\partial y} E_{y, X_{0}, n}^{L, t a n}) (y^{*}) \neq 0

for all

n \in N

.
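The statement of Remark 6 can be checked with exact rational arithmetic. The sketch below evaluates the β_A ≠ 1 branch of (75) as printed above; the function name and the argument name `pop0` (standing for X_0) are assumptions made for this illustration.

```python
from fractions import Fraction


def tangent_derivative_at_y_star(beta_a, beta_h, alpha_a, alpha_h, pop0, n):
    """Evaluate the beta_A != 1 branch of (75) exactly (inputs should be Fractions)."""
    diff = beta_a - beta_h
    det = alpha_a * beta_h - alpha_h * beta_a
    geom = (1 - beta_a**n) / (1 - beta_a)
    # First summand: depends on the initial population size X_0.
    term_pop = -(diff**3) / det * geom * (pop0 - alpha_a / (1 - beta_a))
    # Second summand: linear in the observation horizon n.
    term_n = -(diff**2) / beta_a * (1 + alpha_a * diff / ((1 - beta_a) * det)) * n
    return term_pop + term_n


if __name__ == "__main__":
    params = (Fraction(1, 3), Fraction(2, 3), Fraction(2), Fraction(1))
    for pop0 in (1, 2, 3, 4):
        values = [tangent_derivative_at_y_star(*params, pop0, n) for n in (1, 2, 3)]
        print(pop0, values)
```

For this parameter setup both the factor X_0 - α_A / (1 - β_A) (at X_0 = 3) and the coefficient of n vanish, which is why the derivative is zero for every n exactly when X_0 = 3.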

It seems that the optimization problem in (71) admits in general only an implicitly representable solution, and thus we have used the prefix “(semi-)” above. Of course, as a less tight but less involved explicit lower bound of the Kullback-Leibler information divergence (relative entropy)

I (P_{A, n} ∥ P_{H, n})

one can use any term of the form

max \{E_{y, X_{0}, n}^{L, t a n}, E_{k, X_{0}, n}^{L, s e c}, E_{X_{0}, n}^{L, h o r}\}

(

y \in [0, \infty [

,

k \in N_{0}

), as well as the following

Corollary 12.

(a) For all

(β_{A}, β_{H}, α_{A}, α_{H}) \in P_{SP} \ P_{SP, 1}

, all initial population sizes

X_{0} \in N

and all observation horizons

n \in N

I (P_{A, n} ∥ P_{H, n}) \geq E_{X_{0}, n}^{L} \geq {\tilde{E}}_{X_{0}, n}^{L} : = max \{E_{\infty, X_{0}, n}^{L, t a n}, E_{0, X_{0}, n}^{L, s e c}, E_{X_{0}, n}^{L, h o r}\} \in [0, \infty [,

with

E_{X_{0}, n}^{L, h o r}

defined by (74), with (possibly negatively valued) finite bound component

E_{\infty, X_{0}, n}^{L, t a n} : = {lim}_{y \to \infty} E_{y, X_{0}, n}^{L, t a n}

, where

E_{\infty, X_{0}, n}^{L, t a n} : = \{\begin{matrix} \frac{β_{A} \cdot (log (\frac{β_{A}}{β_{H}}) - 1) + β_{H}}{1 - β_{A}} \cdot [X_{0} - \frac{α_{A}}{1 - β_{A}}] \cdot (1 - {(β_{A})}^{n}) \\ + [\frac{α_{A} \cdot [β_{A} \cdot (log (\frac{β_{A}}{β_{H}}) - 1) + β_{H}]}{β_{A} (1 - β_{A})} + α_{A} (1 - \frac{β_{H}}{β_{A}}) + α_{H} (1 - \frac{β_{A}}{β_{H}})] \cdot n, & if β_{A} \neq 1, \\ [β_{H} - log β_{H} - 1] \cdot [\frac{α_{A}}{2} \cdot n^{2} + (X_{0} + \frac{α_{A}}{2}) \cdot n] \\ + [α_{A} (1 - β_{H}) + α_{H} (1 - \frac{1}{β_{H}})] \cdot n, & if β_{A} = 1, \end{matrix}

and (possibly negatively valued) finite bound component

E_{0, X_{0}, n}^{L, s e c} = \{\begin{matrix} [(α_{A} + β_{A}) \cdot log (\frac{α_{A} + β_{A}}{α_{H} + β_{H}}) - α_{A} \cdot log (\frac{α_{A}}{α_{H}}) + β_{H} - β_{A}] \cdot \frac{1 - {(β_{A})}^{n}}{1 - β_{A}} \cdot [X_{0} - \frac{α_{A}}{1 - β_{A}}] \\ + {\frac{α_{A}}{β_{A} (1 - β_{A})} ((α_{A} + β_{A}) \cdot log (\frac{α_{A} + β_{A}}{α_{H} + β_{H}}) - α_{A} \cdot log (\frac{α_{A}}{α_{H}})) - \frac{α_{A}}{1 - β_{A}} (1 - β_{H}) \\ - α_{A} (1 + \frac{α_{A}}{β_{A}}) \cdot log (\frac{α_{H} (α_{A} + β_{A})}{α_{A} (α_{H} + β_{H})}) + α_{H}} \cdot n, & if β_{A} \neq 1, \\ [(α_{A} + 1) \cdot log (\frac{α_{A} + 1}{α_{H} + β_{H}}) - α_{A} \cdot log (\frac{α_{A}}{α_{H}}) + β_{H} - 1] \cdot [n \cdot X_{0} + \frac{α_{A}}{2} \cdot n^{2}] \\ + {\frac{α_{A}}{2} [(α_{A} + 1) \cdot log (\frac{α_{A} + 1}{α_{H} + β_{H}}) - α_{A} \cdot log (\frac{α_{A}}{α_{H}}) - β_{H} - 1] \\ - α_{A} (1 + α_{A}) \cdot log (\frac{α_{H} (α_{A} + 1)}{α_{A} (α_{H} + β_{H})}) + α_{H}} \cdot n, & if β_{A} = 1 . \end{matrix}

For the cases

P_{SP, 2} \cup P_{SP, 3 a} \cup P_{SP, 3 b}

one gets even

{\tilde{E}}_{X_{0}, n}^{L} > 0

for all

X_{0} \in N

and all

n \in N

.

5.3. Applications to Bayesian Decision Making

As explained in Section 2.5, the Kullback-Leibler information divergence fulfills

I (P_{A, n} ∥ P_{H, n}) = \int_{0}^{1} Δ {BR}_{\tilde{LO}} (p_{A}^{prior}) \cdot {(1 - p_{A}^{prior})}^{- 1} \cdot {(p_{A}^{prior})}^{- 2} d p_{A}^{prior}, (cf. (21) with λ = 1),

and thus can be interpreted as weighted-average decision risk reduction (weighted-average statistical information measure) about the degree of evidence

d e g

concerning the parameter

θ

that can be attained by observing the GWI-path

X_{n}

until stage n. Hence, by combining (21) with the investigations in the previous Section 5.1 and Section 5.2, we obtain exact values respectively bounds of the above-mentioned decision risk reductions. For the sake of brevity, we omit the details here.

6. Explicit Closed-Form Bounds of Hellinger Integrals

6.1. Principal Approach

Depending on the parameter constellation

(β_{A}, β_{H}, α_{A}, α_{H}, λ) \in P \times (R \ {0, 1})

, for the Hellinger integrals

H_{λ} (P_{A, n} ∥ P_{H, n})

we have derived in Section 3 corresponding lower/upper bounds respectively exact values (all of recursive nature), which can be obtained by choosing appropriate

p = p_{λ}^{A} = p^{A} (β_{A}, β_{H}, α_{A}, α_{H}, λ), q = q_{λ}^{A} = q^{A} (β_{A}, β_{H}, α_{A}, α_{H}, λ)

(

A \in {E, L, U}

) and by using those together with the recursion

{(a_{n}^{(q)})}_{n \in N}

defined by (36) as well as the sequence

{(b_{n}^{(p, q)})}_{n \in N}

obtained from

{(a_{n}^{(q)})}_{n \in N}

by the linear transformation (38). Both sequences are “stepwise fully evaluable” but generally seem not to admit a closed-form representation in the observation horizons n; consequently, the time-evolution

n \mapsto H_{λ} (P_{A, n} ∥ P_{H, n})

, respectively the time-evolution of the corresponding recursive bounds, can generally not be seen explicitly. In order to avoid this lack of transparency (at the expense of losing some precision), one can approximate (36) by a recursion that allows for a closed-form representation; incidentally, this will also turn out to be useful for investigations concerning diffusion limits (cf. the next Section 7).

To explain the basic underlying principle, let us first assume some general

q \in] 0, β_{λ} [

and

λ \in] 0, 1 [

. With Properties 1 (P1) we see that the sequence

{(a_{n}^{(q)})}_{n \in N}

is strictly negative, strictly decreasing and converges to

x_{0}^{(q)} \in] - β_{λ}, q - β_{λ} [

. Recall that this sequence is obtained by the recursive application of the function

ξ_{λ}^{(q)} (x) : = q \cdot e^{x} - β_{λ}

, through

a_{1}^{(q)} = ξ_{λ}^{(q)} (0) = q - β_{λ} < 0

,

a_{n}^{(q)} = ξ_{λ}^{(q)} (a_{n - 1}^{(q)}) = q e^{a_{n - 1}^{(q)}} - β_{λ}

(cf. (36)). As a first step, we want to approximate

ξ_{λ}^{(q)} (\cdot)

by a linear function on the interval

[x_{0}^{(q)}, 0]

. Due to convexity (P9), this is done by using the tangent line of

ξ_{λ}^{(q)} (\cdot)

at

x_{0}^{(q)}

ξ_{λ}^{(q), T} (x) : = c^{(q), T} + d^{(q), T} \cdot x : = x_{0}^{(q)} (1 - q \cdot e^{x_{0}^{(q)}}) + q \cdot e^{x_{0}^{(q)}} \cdot x,

(76)

as a linear lower bound, and the secant line of

ξ_{λ}^{(q)} (\cdot)

across its arguments 0 and

x_{0}^{(q)}

ξ_{λ}^{(q), S} (x) : = c^{(q), S} + d^{(q), S} \cdot x : = q - β_{λ} + \frac{x_{0}^{(q)} - (q - β_{λ})}{x_{0}^{(q)}} \cdot x,

(77)

as a linear upper bound. With the help of these functions, we can define the linear recursions

\begin{matrix} a_{0}^{(q), T} : = 0, a_{n}^{(q), T} : = ξ_{λ}^{(q), T} (a_{n - 1}^{(q), T}), n \in N, \end{matrix}

(78)

\begin{matrix} as well as & a_{0}^{(q), S} : = 0, a_{n}^{(q), S} : = ξ_{λ}^{(q), S} (a_{n - 1}^{(q), S}), n \in N . \end{matrix}

(79)

In the following, we will refer to these sequences as the rudimentary closed-form sequence-bounds.

Clearly, both sequences are strictly negative (on

N

), strictly decreasing, and one gets the sandwiching

a_{n}^{(q), T} < a_{n}^{(q)} \leq a_{n}^{(q), S}

(80)

for all

n \in N

, with equality on the right side iff

n = 1

(where

a_{1}^{(q)} = q - β_{λ} < 0

); moreover,

lim_{n \to \infty} a_{n}^{(q), T} = lim_{n \to \infty} a_{n}^{(q), S} = lim_{n \to \infty} a_{n}^{(q)} = x_{0}^{(q)} .

(81)

Furthermore, such linear recursions allow for a closed-form representation, namely

a_{n}^{(q), *} = \frac{c^{(q), *}}{1 - d^{(q), *}} \cdot (1 - {(d^{(q), *})}^{n}) = x_{0}^{(q)} \cdot (1 - {(d^{(q), *})}^{n}),

(82)

where the “ * ” stands for either S or T. Notice that this representation is valid due to

d^{(q), T}, d^{(q), S} \in] 0, 1 [

. So far, we have considered the case

q \in] 0, β_{λ} [

. If

q = β_{λ}

, then one can see from Properties 1 (P2) that

a_{n}^{(q)} \equiv 0

, which is also an explicitly given (though trivial) sequence. For the remaining case (where

q > β_{λ}

and thus

ξ_{λ}^{(q)} (0) = a_{1}^{(q)} = q - β_{λ} > 0

), we want to exclude

q \geq min \{1, e^{β_{λ} - 1}\}

for the following reasons. Firstly, if

q > min \{1, e^{β_{λ} - 1}\}

, then from (P3) we see that the sequence

{(a_{n}^{(q)})}_{n \in N}

is strictly increasing and diverges to ∞ at a faster-than-exponential rate (P3b); a linear recursion is too weak to approximate such a growth pattern. Secondly, if

q = min \{1, e^{β_{λ} - 1}\}

, then one necessarily gets

q = e^{β_{λ} - 1} < 1

(since we have required

q > β_{λ}

, and otherwise one obtains the contradiction

β_{λ} < q = 1 \leq e^{β_{λ} - 1}

). This means that the function

ξ_{λ}^{(q)} (\cdot)

now touches the straight line

i d (\cdot)

in the point

- log (q)

, i.e.,

ξ_{λ}^{(q)} (- log (q)) = - log (q)

. Our above-proposed method, namely to use the tangent line of

ξ_{λ}^{(q)} (\cdot)

at

x = x_{0}^{(q)} = - log (q)

as a linear lower bound for

ξ_{λ}^{(q)} (\cdot)

, leads then to the recursion

a_{n}^{(q), T} \equiv 0

(cf. (78)). This is due to the fact that the tangent line

ξ_{λ}^{(q), T} (\cdot)

coincides in the current case with the straight line

i d (\cdot)

. Consequently, (81) would not be satisfied.

Notice that in the case

β_{λ} < q < min \{1, e^{β_{λ} - 1}\}

, the above-introduced functions

ξ_{λ}^{(q), T} (\cdot), ξ_{λ}^{(q), S} (\cdot)

constitute again linear lower and upper bounds for

ξ_{λ}^{(q)} (\cdot)

, however, this time on the interval

[0, x_{0}^{(q)}]

. The sequences defined in (78) and (79) still fulfill the assertions (80) and (81), and additionally allow for the closed-form representation (82). Furthermore, let us mention that these rudimentary closed-form sequence-bounds can be defined analogously for

λ \in R \ [0, 1]

and either

0 < q < β_{λ}

, or

q = β_{λ}

, or

max {0, β_{λ}} < q < min {1, e^{β_{λ} - 1}}

.
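The rudimentary closed-form sequence-bounds can be illustrated numerically for the case 0 < q < β_λ. The following minimal sketch treats q and β_λ as given numbers (the values used are arbitrary test values, not taken from the paper), locates x_0^{(q)} by bisection on ]-β_λ, q - β_λ[, and compares the closed form (82) for the tangent and secant variants with the recursion (36). All function names are assumptions introduced for illustration.

```python
import math


def fixed_point(q, beta_lam):
    """Bisection for x0 in ]-beta_lam, q - beta_lam[ with q*exp(x0) - beta_lam = x0.

    Only intended for the case 0 < q < beta_lam.
    """
    def g(x):
        return q * math.exp(x) - beta_lam - x

    lo, hi = -beta_lam, q - beta_lam  # g(lo) > 0 > g(hi)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if g(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def rudimentary_bounds(q, beta_lam, n_max):
    """Return x0 and rows (n, a_n^T, a_n, a_n^S) for n = 1, ..., n_max."""
    x0 = fixed_point(q, beta_lam)
    d_tan = q * math.exp(x0)                # slope of the tangent line (76)
    d_sec = (x0 - (q - beta_lam)) / x0      # slope of the secant line (77)
    a = 0.0
    rows = []
    for n in range(1, n_max + 1):
        a = q * math.exp(a) - beta_lam      # recursion (36)
        a_tan = x0 * (1.0 - d_tan**n)       # closed form (82) with * = T
        a_sec = x0 * (1.0 - d_sec**n)       # closed form (82) with * = S
        rows.append((n, a_tan, a, a_sec))
    return x0, rows


if __name__ == "__main__":
    x0, rows = rudimentary_bounds(q=0.5, beta_lam=0.8, n_max=8)
    print("x0 =", x0)
    for n, a_tan, a, a_sec in rows:
        print(n, a_tan, a, a_sec)
```

The printed rows display the sandwiching (80): the tangent-based value lies below the recursive value, the secant-based value lies above it (with equality at n = 1), and all three columns approach x_0^{(q)}.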

In a second step, we want to improve the above-mentioned linear (lower and upper) approximations of the sequence

a_{n}^{(q)}

by reducing the error incurred in each iteration. To do so, for both the lower and the upper approximates we shall employ context-adapted linear inhomogeneous difference equations of the form

\begin{matrix} {\tilde{a}}_{0} : = 0 & ; & {\tilde{a}}_{n} : = \tilde{ξ} ({\tilde{a}}_{n - 1}) + ρ_{n - 1}, n \in N, \end{matrix}

(83)

with

\begin{matrix} \tilde{ξ} (x) & : = & c + d \cdot x, x \in R, \end{matrix}

(84)

\begin{matrix} ρ_{n - 1} & : = & K_{1} \cdot ϰ^{n - 1} + K_{2} \cdot ν^{n - 1}, n \in N, \end{matrix}

(85)

for some constants

c \in R

,

d \in] 0, 1 [

,

K_{1}, K_{2}, ϰ, ν \in R

with

0 \leq ν < ϰ \leq d

. This will be applied to

c : = c^{(q), S}

,

c : = c^{(q), T}

,

d : = d^{(q), S}

and

d : = d^{(q), T}

later on. Meanwhile, let us first present some facts and expressions which are insightful for further formulations and analyses.

Lemma 2.

Consider the sequence

{({\tilde{a}}_{n})}_{n \in N_{0}}

defined in (83) to (85). If

0 \leq ν < ϰ < d

, then one gets the closed-form representation

{\tilde{a}}_{n} = {\tilde{a}}_{n}^{h o m} + {\tilde{c}}_{n} with {\tilde{a}}_{n}^{h o m} = c \cdot \frac{1 - d^{n}}{1 - d} and {\tilde{c}}_{n} = K_{1} \cdot \frac{d^{n} - ϰ^{n}}{d - ϰ} + K_{2} \cdot \frac{d^{n} - ν^{n}}{d - ν},

(86)

which leads for all

n \in N

to

\sum_{k = 1}^{n} {\tilde{a}}_{k} = (\frac{K_{1}}{d - ϰ} + \frac{K_{2}}{d - ν} - \frac{c}{1 - d}) \cdot \frac{d \cdot (1 - d^{n})}{1 - d} - \frac{K_{1} \cdot ϰ \cdot (1 - ϰ^{n})}{(d - ϰ) (1 - ϰ)} - \frac{K_{2} \cdot ν \cdot (1 - ν^{n})}{(d - ν) (1 - ν)} + \frac{c}{1 - d} \cdot n .

(87)

If

0 \leq ν < ϰ = d

, then one gets the closed-form representation

{\tilde{a}}_{n} = {\tilde{a}}_{n}^{h o m} + {\tilde{c}}_{n} with {\tilde{a}}_{n}^{h o m} = c \cdot \frac{1 - d^{n}}{1 - d} and {\tilde{c}}_{n} = K_{1} \cdot n \cdot d^{n - 1} + K_{2} \cdot \frac{d^{n} - ν^{n}}{d - ν},

(88)

which leads for all

n \in N

to

\sum_{k = 1}^{n} {\tilde{a}}_{k} = (\frac{K_{1}}{d (1 - d)} + \frac{K_{2}}{d - ν} - \frac{c}{1 - d}) \cdot \frac{d \cdot (1 - d^{n})}{1 - d} - \frac{K_{2} \cdot ν \cdot (1 - ν^{n})}{(d - ν) (1 - ν)} + (\frac{c}{1 - d} - \frac{K_{1} \cdot d^{n}}{1 - d}) \cdot n .

(89)

Lemma 2 will be proved in Appendix A.3. Notice that (88) is consistent with taking the limit

ϰ ↗ d

in (86). Furthermore, for the special case

K_{2} = - K_{1} > 0

one has from (85) for all integers

n \geq 2

the relation

ρ_{n - 1} < 0

and thus

{\tilde{a}}_{n} - {\tilde{a}}_{n}^{h o m} < 0

, leading to

{\tilde{c}}_{n} < 0 and \sum_{k = 1}^{n} {\tilde{c}}_{k} < 0 .

(90)
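The closed-form representations of Lemma 2 can be compared with a direct iteration of (83) to (85). The sketch below is only a numerical sanity check under arbitrarily chosen constants; the function names are assumptions introduced for illustration.

```python
def inhomogeneous_recursion(c, d, k1, kappa, k2, nu, n_max):
    """Iterate (83)-(85) step by step and return [a_1, ..., a_{n_max}]."""
    a, out = 0.0, []
    for n in range(1, n_max + 1):
        rho = k1 * kappa ** (n - 1) + k2 * nu ** (n - 1)  # correction term (85)
        a = c + d * a + rho                               # recursion (83), (84)
        out.append(a)
    return out


def closed_form(c, d, k1, kappa, k2, nu, n):
    """Closed form (86) for kappa < d, respectively (88) for kappa == d."""
    hom = c * (1 - d**n) / (1 - d)
    if kappa == d:
        first = k1 * n * d ** (n - 1)
    else:
        first = k1 * (d**n - kappa**n) / (d - kappa)
    return hom + first + k2 * (d**n - nu**n) / (d - nu)


def closed_form_sum(c, d, k1, kappa, k2, nu, n):
    """Partial sum (87), valid for 0 <= nu < kappa < d."""
    lead = (k1 / (d - kappa) + k2 / (d - nu) - c / (1 - d)) * d * (1 - d**n) / (1 - d)
    part_kappa = k1 * kappa * (1 - kappa**n) / ((d - kappa) * (1 - kappa))
    part_nu = k2 * nu * (1 - nu**n) / ((d - nu) * (1 - nu))
    return lead - part_kappa - part_nu + c / (1 - d) * n


if __name__ == "__main__":
    args = (-0.3, 0.6, -0.05, 0.4, 0.05, 0.2)  # c, d, K1, kappa, K2, nu
    seq = inhomogeneous_recursion(*args, 6)
    for n, value in enumerate(seq, start=1):
        print(n, value, closed_form(*args, n), sum(seq[:n]), closed_form_sum(*args, n))
```

The chosen constants satisfy K_2 = -K_1 > 0 and 0 ≤ ν < ϰ < d, which is the special case discussed around (90).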

Lemma 2 gives explicit expressions for a linear inhomogeneous recursion of the form (83) possessing the extra term given by (85). Therefrom we derive lower and upper bounds for the sequence

{(a_{n}^{(q)})}_{n \in N}

by employing

a_{n}^{(q), T}

resp.

a_{n}^{(q), S}

as the homogeneous solution of (83), i.e., by setting

{\tilde{a}}_{n}^{h o m} : = a_{n}^{(q), T}

resp.

{\tilde{a}}_{n}^{h o m} : = a_{n}^{(q), S}

. Moreover, our concrete approximation-error-reducing “correction terms”

ρ_{n}

will take different forms, depending on whether

0 < q < β_{λ}

or

q > max {0, β_{λ}}

. In both cases, we express

ρ_{n}

by means of the slopes

d^{(q), T} = q e^{x_{0}^{(q)}}

resp.

d^{(q), S} = \frac{x_{0}^{(q)} - (q - β_{λ})}{x_{0}^{(q)}}

of the tangent line

ξ_{λ}^{(q), T} (\cdot)

(cf. (76)) resp. the secant line

ξ_{λ}^{(q), S} (\cdot)

(cf. (77)), as well as in terms of the parameters

Γ_{<}^{(q)} : = \frac{1}{2} \cdot {(x_{0}^{(q)})}^{2} \cdot q \cdot e^{x_{0}^{(q)}}, for 0 < q < β_{λ}, and Γ_{>}^{(q)} : = \frac{q}{2} \cdot {(x_{0}^{(q)})}^{2}, for q > max {0, β_{λ}} .

(91)

In detail, let us first define the lower approximate by

{\underset{̲}{a}}_{0}^{(q)} : = 0, {\underset{̲}{a}}_{n}^{(q)} : = ξ_{λ}^{(q), T} ({\underset{̲}{a}}_{n - 1}^{(q)}) + {\underset{̲}{ρ}}_{n - 1}^{(q)}, n \in N,

(92)

where

{\underset{̲}{ρ}}_{n - 1}^{(q)} : = \{\begin{matrix} Γ_{<}^{(q)} \cdot {(d^{(q), T})}^{2 (n - 1)}, & if 0 < q < β_{λ}, \\ Γ_{>}^{(q)} \cdot {(d^{(q), S})}^{2 (n - 1)}, & if max {0, β_{λ}} < q < min {1, e^{β_{λ} - 1}} . \end{matrix}

(93)

The upper approximate is defined by

{\bar{a}}_{0}^{(q)} : = 0, {\bar{a}}_{n}^{(q)} : = ξ_{λ}^{(q), S} ({\bar{a}}_{n - 1}^{(q)}) + {\bar{ρ}}_{n - 1}^{(q)}, n \in N,

(94)

where

{\bar{ρ}}_{n - 1}^{(q)} : = \{\begin{matrix} - Γ_{<}^{(q)} \cdot {(d^{(q), T})}^{n - 1} \cdot [1 - {(d^{(q), S})}^{n - 1}], & if 0 < q < β_{λ}, \\ - Γ_{>}^{(q)} \cdot {(d^{(q), S})}^{n - 1} \cdot [1 - {(d^{(q), T})}^{n - 1}], & if max {0, β_{λ}} < q < min {1, e^{β_{λ} - 1}} . \end{matrix}

(95)

In terms of (85), we use for

{\underset{̲}{ρ}}_{n}^{(q)}

the constants

K_{2} = ν = 0

as well as

K_{1} = Γ_{<}^{(q)}, ϰ = {(d^{(q), T})}^{2}

for

0 < q < β_{λ}

respectively

K_{1} = Γ_{>}^{(q)}, ϰ = {(d^{(q), S})}^{2}

for

max {0, β_{λ}} < q < min {1, e^{β_{λ} - 1}}

. For

{\bar{ρ}}_{n}^{(q)}

we shall employ the constants

- K_{1} = K_{2} = Γ_{<}^{(q)}, ϰ = d^{(q), T}, ν = d^{(q), S} d^{(q), T}

for

0 < q < β_{λ}

, and

- K_{1} = K_{2} = Γ_{>}^{(q)}, ϰ = d^{(q), S}, ν = d^{(q), S} d^{(q), T}

for

max {0, β_{λ}} < q < min {1, e^{β_{λ} - 1}}

. Recall from (76) the constants

c^{(q), T} : = x_{0}^{(q)} (1 - q e^{x_{0}^{(q)}})

,

d^{(q), T} : = q e^{x_{0}^{(q)}}

and from (77)

c^{(q), S} : = q - β_{λ}

,

d^{(q), S} : = \frac{x_{0}^{(q)} - (q - β_{λ})}{x_{0}^{(q)}}

. In the following, we will refer to the sequences

{\underset{̲}{a}}_{n}^{(q)}

resp.

{\bar{a}}_{n}^{(q)}

as the improved closed-form sequence-bounds. Putting all ingredients together, we arrive at the

Lemma 3.

For all

(β_{A}, β_{H}, α_{A}, α_{H}) \in P

there holds with

d^{(q), T} = q e^{x_{0}^{(q)}}

and

d^{(q), S} = \frac{x_{0}^{(q)} - (q - β_{λ})}{x_{0}^{(q)}}

(a)

in the case

0 < q < β_{λ}

:

(i): ${\underset{̲}{a}}_{n}^{(q)} < a_{n}^{(q)} \leq {\bar{a}}_{n}^{(q)} for all n \in N,$

with equality on the right-hand side iff $n = 1$ , where

$\begin{matrix} {\underset{̲}{a}}_{n}^{(q)} = x_{0}^{(q)} \cdot (1 - {(d^{(q), T})}^{n}) + Γ_{<}^{(q)} \cdot \frac{{(d^{(q), T})}^{n - 1}}{1 - d^{(q), T}} \cdot (1 - {(d^{(q), T})}^{n}) > a_{n}^{(q), T}, and \\ {\bar{a}}_{n}^{(q)} = x_{0}^{(q)} \cdot (1 - {(d^{(q), S})}^{n}) - Γ_{<}^{(q)} \cdot [\frac{{(d^{(q), S})}^{n} - {(d^{(q), T})}^{n}}{d^{(q), S} - d^{(q), T}} - {(d^{(q), S})}^{n - 1} \frac{1 - {(d^{(q), T})}^{n}}{1 - d^{(q), T}}] \leq a_{n}^{(q), S}, \end{matrix}$

with $a_{n}^{(q), T}$ and $a_{n}^{(q), S}$ defined by (78) and (79).
(ii): Both sequences ${({\underset{̲}{a}}_{n}^{(q)})}_{n \in N}$ and ${({\bar{a}}_{n}^{(q)})}_{n \in N}$ are strictly decreasing.
(iii): $lim_{n \to \infty} {\underset{̲}{a}}_{n}^{(q)} = lim_{n \to \infty} {\bar{a}}_{n}^{(q)} = lim_{n \to \infty} a_{n}^{(q)} = x_{0}^{(q)} \in] - β_{λ}, q - β_{λ} [.$

(b)

in the case

max {0, β_{λ}} < q < min \{1, e^{β_{λ} - 1}\}

:

(i): ${\underset{̲}{a}}_{n}^{(q)} < a_{n}^{(q)} \leq {\bar{a}}_{n}^{(q)}, for all n \in N,$

with equality on the right-hand side iff $n = 1$ , where

$\begin{matrix} {\underset{̲}{a}}_{n}^{(q)} = x_{0}^{(q)} \cdot (1 - {(d^{(q), T})}^{n}) + Γ_{>}^{(q)} \cdot \frac{{(d^{(q), T})}^{n} - {(d^{(q), S})}^{2 n}}{d^{(q), T} - {(d^{(q), S})}^{2}} > a_{n}^{(q), T} and \\ {\bar{a}}_{n}^{(q)} = x_{0}^{(q)} \cdot (1 - {(d^{(q), S})}^{n}) - Γ_{>}^{(q)} \cdot {(d^{(q), S})}^{n - 1} [n - \frac{1 - {(d^{(q), T})}^{n}}{1 - d^{(q), T}}] \leq a_{n}^{(q), S}, \end{matrix}$

with $a_{n}^{(q), T}$ and $a_{n}^{(q), S}$ defined by (78) and (79).
(ii): Both sequences ${({\underset{̲}{a}}_{n}^{(q)})}_{n \in N}$ and ${({\bar{a}}_{n}^{(q)})}_{n \in N}$ are strictly increasing.
(iii): $lim_{n \to \infty} {\underset{̲}{a}}_{n}^{(q)} = lim_{n \to \infty} {\bar{a}}_{n}^{(q)} = lim_{n \to \infty} a_{n}^{(q)} = x_{0}^{(q)} \in] q - β_{λ}, - log (q) [.$

A detailed proof of Lemma 3 is provided in Appendix A.3. In the following, we employ the above-mentioned investigations in order to derive the desired closed-form bounds of the Hellinger integrals

H_{λ} (P_{A, n} ∥ P_{H, n})

.
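For the case 0 < q < β_λ, the improved closed-form sequence-bounds of Lemma 3(a) can be checked against the recursions (92) to (95). The following sketch again treats q and β_λ as given numbers (arbitrary test values) and uses hypothetical function names; it is a numerical illustration rather than a proof.

```python
import math


def fixed_point(q, beta_lam):
    """Bisection for x0 in ]-beta_lam, q - beta_lam[ (case 0 < q < beta_lam)."""
    def g(x):
        return q * math.exp(x) - beta_lam - x

    lo, hi = -beta_lam, q - beta_lam  # g(lo) > 0 > g(hi)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if g(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def improved_bounds(q, beta_lam, n_max):
    """Rows (n, lower_rec, lower_cf, a_n, upper_rec, upper_cf) for 0 < q < beta_lam."""
    x0 = fixed_point(q, beta_lam)
    d_t = q * math.exp(x0)                      # tangent slope, cf. (76)
    d_s = (x0 - (q - beta_lam)) / x0            # secant slope, cf. (77)
    c_t, c_s = x0 * (1 - d_t), q - beta_lam     # intercepts from (76), (77)
    gamma = 0.5 * x0**2 * d_t                   # Gamma_< from (91)
    a = lower = upper = 0.0
    rows = []
    for n in range(1, n_max + 1):
        rho_low = gamma * d_t ** (2 * (n - 1))                  # (93), first line
        rho_up = -gamma * d_t ** (n - 1) * (1 - d_s ** (n - 1)) # (95), first line
        a = q * math.exp(a) - beta_lam                          # recursion (36)
        lower = c_t + d_t * lower + rho_low                     # recursion (92)
        upper = c_s + d_s * upper + rho_up                      # recursion (94)
        # Closed forms stated in Lemma 3(a)(i).
        lower_cf = x0 * (1 - d_t**n) + gamma * d_t ** (n - 1) / (1 - d_t) * (1 - d_t**n)
        upper_cf = x0 * (1 - d_s**n) - gamma * (
            (d_s**n - d_t**n) / (d_s - d_t)
            - d_s ** (n - 1) * (1 - d_t**n) / (1 - d_t)
        )
        rows.append((n, lower, lower_cf, a, upper, upper_cf))
    return rows


if __name__ == "__main__":
    for row in improved_bounds(q=0.5, beta_lam=0.8, n_max=6):
        print(row)
```

In each printed row the recursive and closed-form values of the lower (respectively upper) approximate agree up to rounding, and they enclose the value of the original recursion.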

6.2. Explicit Closed-Form Bounds for the Cases $(β_{A}, β_{H}, α_{A}, α_{H}, λ) \in (P_{NI} \cup P_{SP, 1}) \times (R \ {0, 1})$

Recall that in this setup, we have obtained the recursive, non-explicit exact values

V_{λ, X_{0}, n} = H_{λ} (P_{A, n} ∥ P_{H, n})

given in (39) of Theorem 1, where we used

q = q_{λ}^{E} = q^{E} (β_{A}, β_{H}, λ) = β_{A}^{λ} β_{H}^{1 - λ} \in] 0, β_{λ} [

in the case

λ \in] 0, 1 [

respectively

q = q_{λ}^{E} = β_{A}^{λ} β_{H}^{1 - λ} > max {0, β_{λ}}

in the case

λ \in R \ [0, 1]

. For the latter, Lemma 1 implies that

q_{λ}^{E} < min {1, e^{β_{λ} - 1}}

iff

λ \in] λ_{-}, λ_{+} [\ [0, 1]

. This, together with (39) from Theorem 1, Lemma 2 and with the quantities

d^{(q), T}, d^{(q), S}

,

Γ_{<}^{(q)}

and

Γ_{>}^{(q)}

as defined in (76) and (77) resp. (91), leads to

Theorem 5.

Let

p_{λ}^{E} : = α_{A}^{λ} α_{H}^{1 - λ}

and

q_{λ}^{E} : = β_{A}^{λ} β_{H}^{1 - λ}

. For all

(β_{A}, β_{H}, α_{A}, α_{H}, λ) \in (P_{NI} \cup P_{SP, 1}) \times (] λ_{-}, λ_{+} [\ {0, 1})

, all initial population sizes

X_{0} \in N

and for all observation horizons

n \in N

the following assertions hold true:

(a): the Hellinger integral can be bounded by the closed-form lower and upper bounds

$C_{λ, X_{0}, n}^{(p_{λ}^{E}, q_{λ}^{E}), T} \leq C_{λ, X_{0}, n}^{(p_{λ}^{E}, q_{λ}^{E}), L} \leq V_{λ, X_{0}, n} = H_{λ} (P_{A, n} ∥ P_{H, n}) \leq C_{λ, X_{0}, n}^{(p_{λ}^{E}, q_{λ}^{E}), U} \leq C_{λ, X_{0}, n}^{(p_{λ}^{E}, q_{λ}^{E}), S},$
(b): $\begin{matrix} lim_{n \to \infty} \frac{1}{n} log (V_{λ, X_{0}, n}) & = & lim_{n \to \infty} \frac{1}{n} log (C_{λ, X_{0}, n}^{(p_{λ}^{E}, q_{λ}^{E}), L}) = lim_{n \to \infty} \frac{1}{n} log (C_{λ, X_{0}, n}^{(p_{λ}^{E}, q_{λ}^{E}), U}) \\ = & lim_{n \to \infty} \frac{1}{n} log (C_{λ, X_{0}, n}^{(p_{λ}^{E}, q_{λ}^{E}), T}) = lim_{n \to \infty} \frac{1}{n} log (C_{λ, X_{0}, n}^{(p_{λ}^{E}, q_{λ}^{E}), S}) = \frac{α_{A}}{β_{A}} \cdot x_{0}^{(q_{λ}^{E})}, \end{matrix}$

where the involved closed-form lower bounds are defined by

\begin{matrix} C_{λ, X_{0}, n}^{(p_{λ}^{E}, q_{λ}^{E}), L} & : = & C_{λ, X_{0}, n}^{(p_{λ}^{E}, q_{λ}^{E}), T} \cdot exp \{{\underset{̲}{ζ}}_{n}^{(q_{λ}^{E})} \cdot X_{0} + \frac{α_{A}}{β_{A}} \cdot {\underset{̲}{ϑ}}_{n}^{(q_{λ}^{E})}\}, with \\ C_{λ, X_{0}, n}^{(p_{λ}^{E}, q_{λ}^{E}), T} & : = & exp \{x_{0}^{(q_{λ}^{E})} \cdot [X_{0} - \frac{α_{A}}{β_{A}} \cdot \frac{d^{(q_{λ}^{E}), T}}{1 - d^{(q_{λ}^{E}), T}}] \cdot (1 - {(d^{(q_{λ}^{E}), T})}^{n}) + \frac{α_{A}}{β_{A}} x_{0}^{(q_{λ}^{E})} \cdot n\}, \end{matrix}

(96)

and the closed-form upper bounds are defined by

\begin{matrix} C_{λ, X_{0}, n}^{(p_{λ}^{E}, q_{λ}^{E}), U} & : = & C_{λ, X_{0}, n}^{(p_{λ}^{E}, q_{λ}^{E}), S} \cdot exp \{- {\bar{ζ}}_{n}^{(q_{λ}^{E})} \cdot X_{0} - \frac{α_{A}}{β_{A}} \cdot {\bar{ϑ}}_{n}^{(q_{λ}^{E})}\}, with \\ C_{λ, X_{0}, n}^{(p_{λ}^{E}, q_{λ}^{E}), S} & : = & exp \{x_{0}^{(q_{λ}^{E})} \cdot [X_{0} - \frac{α_{A}}{β_{A}} \cdot \frac{d^{(q_{λ}^{E}), S}}{1 - d^{(q_{λ}^{E}), S}}] \cdot (1 - {(d^{(q_{λ}^{E}), S})}^{n}) + \frac{α_{A}}{β_{A}} x_{0}^{(q_{λ}^{E})} \cdot n\}, \end{matrix}

(97)

where in the case

λ \in] 0, 1 [

\begin{matrix} {\underset{̲}{ζ}}_{n}^{(q_{λ}^{E})} & : = & Γ_{<}^{(q_{λ}^{E})} \cdot \frac{{(d^{(q_{λ}^{E}), T})}^{n - 1}}{1 - d^{(q_{λ}^{E}), T}} \cdot (1 - {(d^{(q_{λ}^{E}), T})}^{n}) > 0, \end{matrix}

(98)

\begin{matrix} {\underset{̲}{ϑ}}_{n}^{(q_{λ}^{E})} & : = & Γ_{<}^{(q_{λ}^{E})} \cdot \frac{1 - {(d^{(q_{λ}^{E}), T})}^{n}}{{(1 - d^{(q_{λ}^{E}), T})}^{2}} \cdot [1 - \frac{d^{(q_{λ}^{E}), T} (1 + {(d^{(q_{λ}^{E}), T})}^{n})}{1 + d^{(q_{λ}^{E}), T}}] > 0, \end{matrix}

(99)

\begin{matrix} {\bar{ζ}}_{n}^{(q_{λ}^{E})} & : = & Γ_{<}^{(q_{λ}^{E})} \cdot [\frac{{(d^{(q_{λ}^{E}), S})}^{n} - {(d^{(q_{λ}^{E}), T})}^{n}}{d^{(q_{λ}^{E}), S} - d^{(q_{λ}^{E}), T}} - {(d^{(q_{λ}^{E}), S})}^{n - 1} \cdot \frac{1 - {(d^{(q_{λ}^{E}), T})}^{n}}{1 - d^{(q_{λ}^{E}), T}}] > 0, \end{matrix}

(100)

\begin{matrix} {\bar{ϑ}}_{n}^{(q_{λ}^{E})} & : = & Γ_{<}^{(q_{λ}^{E})} \cdot \frac{d^{(q_{λ}^{E}), T}}{1 - d^{(q_{λ}^{E}), T}} \cdot [\frac{1 - {(d^{(q_{λ}^{E}), S} d^{(q_{λ}^{E}), T})}^{n}}{1 - d^{(q_{λ}^{E}), S} d^{(q_{λ}^{E}), T}} - \frac{{(d^{(q_{λ}^{E}), S})}^{n} - {(d^{(q_{λ}^{E}), T})}^{n}}{d^{(q_{λ}^{E}), S} - d^{(q_{λ}^{E}), T}}] > 0, \end{matrix}

(101)

and where in the case

λ \in] λ_{-}, λ_{+} [\ [0, 1]

\begin{matrix} {\underset{̲}{ζ}}_{n}^{(q_{λ}^{E})} & : = & Γ_{>}^{(q_{λ}^{E})} \cdot \frac{{(d^{(q_{λ}^{E}), T})}^{n} - {(d^{(q_{λ}^{E}), S})}^{2 n}}{d^{(q_{λ}^{E}), T} - {(d^{(q_{λ}^{E}), S})}^{2}} > 0, \end{matrix}

(102)

\begin{matrix} {\underset{̲}{ϑ}}_{n}^{(q_{λ}^{E})} & : = & \frac{Γ_{>}^{(q_{λ}^{E})}}{d^{(q_{λ}^{E}), T} - {(d^{(q_{λ}^{E}), S})}^{2}} [\frac{d^{(q_{λ}^{E}), T} (1 - {(d^{(q_{λ}^{E}), T})}^{n})}{1 - d^{(q_{λ}^{E}), T}} - \frac{{(d^{(q_{λ}^{E}), S})}^{2} (1 - {(d^{(q_{λ}^{E}), S})}^{2 n})}{1 - {(d^{(q_{λ}^{E}), S})}^{2}}] \\ > & 0, \end{matrix}

(103)

\begin{matrix} {\bar{ζ}}_{n}^{(q_{λ}^{E})} & : = & Γ_{>}^{(q_{λ}^{E})} \cdot {(d^{(q_{λ}^{E}), S})}^{n - 1} \cdot [n - \frac{1 - {(d^{(q_{λ}^{E}), T})}^{n}}{1 - d^{(q_{λ}^{E}), T}}] > 0, \end{matrix}

(104)

\begin{matrix} {\bar{ϑ}}_{n}^{(q_{λ}^{E})} & : = & Γ_{>}^{(q_{λ}^{E})} \cdot [\frac{d^{(q_{λ}^{E}), S} - d^{(q_{λ}^{E}), T}}{{(1 - d^{(q_{λ}^{E}), S})}^{2} (1 - d^{(q_{λ}^{E}), T})} \cdot (1 - {(d^{(q_{λ}^{E}), S})}^{n}) \\ + \frac{d^{(q_{λ}^{E}), T} (1 - {(d^{(q_{λ}^{E}), S} d^{(q_{λ}^{E}), T})}^{n})}{(1 - d^{(q_{λ}^{E}), T}) (1 - d^{(q_{λ}^{E}), S} d^{(q_{λ}^{E}), T})} - \frac{{(d^{(q_{λ}^{E}), S})}^{n}}{1 - d^{(q_{λ}^{E}), S}} \cdot n] > 0 . \end{matrix}

(105)

Notice that

\frac{α_{A}}{β_{A}}

can equivalently be replaced by

\frac{α_{H}}{β_{H}}

in (96) and in (97).

A proof of Theorem 5 is given in Appendix A.3.
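The lower bounds of Theorem 5 for λ ∈ ]0, 1[ can be evaluated as follows. This is a minimal sketch under two assumptions that are not restated in this section: β_λ is taken to be λ β_A + (1 - λ) β_H, and the exact value (39) is taken to have the form exp{a_n X_0 + (α_A / β_A) Σ_{k ≤ n} a_k}, which is the form from which (96) arises when a_n is replaced by the tangent sequence. The function names and the test parameters are hypothetical.

```python
import math


def fixed_point(q, beta_lam):
    """Bisection for x0 in ]-beta_lam, q - beta_lam[ (case 0 < q < beta_lam)."""
    def g(x):
        return q * math.exp(x) - beta_lam - x

    lo, hi = -beta_lam, q - beta_lam  # g(lo) > 0 > g(hi)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if g(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def log_lower_bounds(beta_a, beta_h, alpha_a, lam, pop0, n):
    """Return (log C^T, log C^L, log V) for lam in ]0, 1[.

    Assumptions (see lead-in): beta_lam = lam*beta_a + (1-lam)*beta_h, and
    log V = a_n * X_0 + (alpha_a / beta_a) * sum_{k<=n} a_k as the form of (39).
    """
    beta_lam = lam * beta_a + (1 - lam) * beta_h
    q = beta_a**lam * beta_h ** (1 - lam)          # q_lambda^E
    ratio = alpha_a / beta_a
    x0 = fixed_point(q, beta_lam)
    d = q * math.exp(x0)                           # d^{(q), T}
    gamma = 0.5 * x0**2 * d                        # Gamma_< from (91)
    a, a_sum = 0.0, 0.0
    for _ in range(n):
        a = q * math.exp(a) - beta_lam             # recursion (36)
        a_sum += a
    log_v = a * pop0 + ratio * a_sum
    log_c_tan = x0 * (pop0 - ratio * d / (1 - d)) * (1 - d**n) + ratio * x0 * n  # (96)
    zeta = gamma * d ** (n - 1) / (1 - d) * (1 - d**n)                           # (98)
    theta = gamma * (1 - d**n) / (1 - d) ** 2 * (1 - d * (1 + d**n) / (1 + d))   # (99)
    log_c_low = log_c_tan + zeta * pop0 + ratio * theta                          # (96)
    return log_c_tan, log_c_low, log_v


if __name__ == "__main__":
    # Hypothetical setup with alpha_A / beta_A = alpha_H / beta_H = 2.
    for n in (1, 5, 20):
        print(n, log_lower_bounds(0.3, 1.2, 0.6, lam=0.5, pop0=5, n=n))
```

Under the stated assumptions the three logarithms appear in increasing order, in line with part (a) of Theorem 5; dividing them by n for growing n illustrates the common limit in part (b).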

6.3. Explicit Closed-Form Bounds for the Cases $(β_{A}, β_{H}, α_{A}, α_{H}, λ) \in (P_{SP} \ P_{SP, 1}) \times] 0, 1 [$

To derive (explicit) closed-form lower bounds of the (nonexplicit) recursive lower bounds

B_{λ, X_{0}, n}^{L}

for the Hellinger integral

H_{λ} (P_{A, n} ∥ P_{H, n})

respectively closed-form upper bounds of the recursive upper bounds

B_{λ, X_{0}, n}^{U}

for all parameter cases

(β_{A}, β_{H}, α_{A}, α_{H}, λ) \in (P_{SP} \ P_{SP, 1}) \times (R \ {0, 1})

, we combine part (b) of Theorem 1, Lemma 2, Lemma 3 together with appropriate parameters

p_{λ}^{L} = p^{L} (β_{A}, β_{H}, α_{A}, α_{H}, λ), p_{λ}^{U} = p^{U} (β_{A}, β_{H}, α_{A}, α_{H}, λ) \geq 0

and

q_{λ}^{L} = q^{L} (β_{A}, β_{H}, α_{A}, α_{H}, λ)

,

q_{λ}^{U} = q^{U} (β_{A}, β_{H}, α_{A}, α_{H}, λ) > 0

satisfying (35). Notice that the representations of the lower and upper closed-form sequence-bounds depend on whether

0 < q_{λ}^{A} < β_{λ}

,

0 < q_{λ}^{A} = β_{λ}

or

max {0, β_{λ}} < q_{λ}^{A} < min {1, e^{β_{λ} - 1}}

(

A \in {L, U})

.

Let us start with closed-form lower bounds for the case

λ \in] 0, 1 [

; recall that the choice

p_{λ}^{L} = α_{A}^{λ} α_{H}^{1 - λ}, q_{λ}^{L} = β_{A}^{λ} β_{H}^{1 - λ}

led to the optimal recursive lower bounds

B_{λ, X_{0}, n}^{L}

of the Hellinger integral (cf. Theorem 1(b) and Section 3.5). Correspondingly, we can derive

Theorem 6.

Let

p_{λ}^{L} = α_{A}^{λ} α_{H}^{1 - λ}

and

q_{λ}^{L} = β_{A}^{λ} β_{H}^{1 - λ}

. Then, the following assertions hold true:

(a): For all $(β_{A}, β_{H}, α_{A}, α_{H}, λ) \in (P_{SP, 2} \cup P_{SP, 3 a} \cup P_{SP, 3 b} \cup P_{SP, 3 c}) \times] 0, 1 [$ (for which particularly $0 < q_{λ}^{L} < β_{λ}$ , $β_{A} \neq β_{H}$ ), all initial population sizes $X_{0} \in N$ and all observation horizons $n \in N$ there holds

$\begin{matrix} C_{λ, X_{0}, n}^{(p_{λ}^{L}, q_{λ}^{L}), T} & \leq & C_{λ, X_{0}, n}^{(p_{λ}^{L}, q_{λ}^{L}), L} \leq B_{λ, X_{0}, n}^{L} < 1, \end{matrix}$

$\begin{matrix} where C_{λ, X_{0}, n}^{(p_{λ}^{L}, q_{λ}^{L}), L} & : = & C_{λ, X_{0}, n}^{(p_{λ}^{L}, q_{λ}^{L}), T} \cdot exp \{{\underset{̲}{ζ}}_{n}^{(q_{λ}^{L})} \cdot X_{0} + \frac{p_{λ}^{L}}{q_{λ}^{L}} \cdot {\underset{̲}{ϑ}}_{n}^{(q_{λ}^{L})}\} \end{matrix}$

(106)

$\begin{matrix} with C_{λ, X_{0}, n}^{(p_{λ}^{L}, q_{λ}^{L}), T} & : = & exp {x_{0}^{(q_{λ}^{L})} \cdot [X_{0} - \frac{p_{λ}^{L}}{q_{λ}^{L}} \cdot \frac{d^{(q_{λ}^{L}), T}}{1 - d^{(q_{λ}^{L}), T}}] \cdot (1 - {(d^{(q_{λ}^{L}), T})}^{n}) \\ + (\frac{p_{λ}^{L}}{q_{λ}^{L}} \cdot (β_{λ} + x_{0}^{(q_{λ}^{L})}) - α_{λ}) \cdot n}, \\ and with {\underset{̲}{ζ}}_{n}^{(q_{λ}^{L})} & : = & Γ_{<}^{(q_{λ}^{L})} \cdot \frac{{(d^{(q_{λ}^{L}), T})}^{n - 1}}{1 - d^{(q_{λ}^{L}), T}} \cdot (1 - {(d^{(q_{λ}^{L}), T})}^{n}) > 0, \end{matrix}$

(107)

$\begin{matrix} {\underset{̲}{ϑ}}_{n}^{(q_{λ}^{L})} & : = & Γ_{<}^{(q_{λ}^{L})} \cdot \frac{1 - {(d^{(q_{λ}^{L}), T})}^{n}}{{(1 - d^{(q_{λ}^{L}), T})}^{2}} \cdot [1 - \frac{d^{(q_{λ}^{L}), T} (1 + {(d^{(q_{λ}^{L}), T})}^{n})}{1 + d^{(q_{λ}^{L}), T}}] > 0 . \end{matrix}$

(108)
(b): For all $(β_{A}, β_{H}, α_{A}, α_{H}, λ) \in (P_{SP, 4 a} \cup P_{SP, 4 b}) \times] 0, 1 [$ (for which particularly $0 < q_{λ}^{L} = β_{λ}$ , $β_{A} = β_{H}$ ), all initial population sizes $X_{0} \in N$ and all observation horizons $n \in N$ there holds

$C_{λ, X_{0}, n}^{(p_{λ}^{L}, q_{λ}^{L}), L} : = C_{λ, X_{0}, n}^{(p_{λ}^{L}, q_{λ}^{L}), T} : = B_{λ, X_{0}, n}^{L} = exp \{(p_{λ}^{L} - α_{λ}) \cdot n\} < 1 .$
(c): For all $(β_{A}, β_{H}, α_{A}, α_{H}, λ) \in (P_{SP} \ P_{SP, 1}) \times] 0, 1 [$ and all initial population sizes $X_{0} \in N$ one gets

$\begin{matrix} lim_{n \to \infty} \frac{1}{n} log (C_{λ, X_{0}, n}^{(p_{λ}^{L}, q_{λ}^{L}), T}) & = & lim_{n \to \infty} \frac{1}{n} log (C_{λ, X_{0}, n}^{(p_{λ}^{L}, q_{λ}^{L}), L}) = lim_{n \to \infty} \frac{1}{n} log (B_{λ, X_{0}, n}^{L}) \\ = & \frac{p_{λ}^{L}}{q_{λ}^{L}} \cdot (β_{λ} + x_{0}^{(q_{λ}^{L})}) - α_{λ} < 0, \end{matrix}$

where in the case $β_{A} = β_{H}$ there holds $q_{λ}^{L} = β_{λ}$ and $x_{0}^{(q_{λ}^{L})} = 0$ .

The proof will be provided in Appendix A.3.
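The asymptotic rate in part (c) of Theorem 6 can be computed directly. The sketch below assumes, as is not restated in this section, that α_λ = λ α_A + (1 - λ) α_H and β_λ = λ β_A + (1 - λ) β_H; the function names are hypothetical, and the parameter values are those of Remark 6 with λ = 1/2.

```python
import math


def fixed_point(q, beta_lam):
    """Bisection for x0 in ]-beta_lam, q - beta_lam[ (case 0 < q < beta_lam)."""
    def g(x):
        return q * math.exp(x) - beta_lam - x

    lo, hi = -beta_lam, q - beta_lam  # g(lo) > 0 > g(hi)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if g(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def _ingredients(beta_a, beta_h, alpha_a, alpha_h, lam):
    # Assumed definitions of alpha_lambda and beta_lambda (see lead-in).
    alpha_lam = lam * alpha_a + (1 - lam) * alpha_h
    beta_lam = lam * beta_a + (1 - lam) * beta_h
    p = alpha_a**lam * alpha_h ** (1 - lam)        # p_lambda^L
    q = beta_a**lam * beta_h ** (1 - lam)          # q_lambda^L
    # For beta_A = beta_H one has q = beta_lam and x0 = 0, cf. Theorem 6(c).
    x0 = 0.0 if math.isclose(q, beta_lam) else fixed_point(q, beta_lam)
    return alpha_lam, beta_lam, p, q, x0


def lower_bound_rate(beta_a, beta_h, alpha_a, alpha_h, lam):
    """Limit of (1/n) log C^T as n tends to infinity, cf. Theorem 6(c)."""
    alpha_lam, beta_lam, p, q, x0 = _ingredients(beta_a, beta_h, alpha_a, alpha_h, lam)
    return p / q * (beta_lam + x0) - alpha_lam


def log_c_tan(beta_a, beta_h, alpha_a, alpha_h, lam, pop0, n):
    """log of C^T from (107); intended for beta_A != beta_H."""
    alpha_lam, beta_lam, p, q, x0 = _ingredients(beta_a, beta_h, alpha_a, alpha_h, lam)
    d = q * math.exp(x0)
    rate = p / q * (beta_lam + x0) - alpha_lam
    return x0 * (pop0 - p / q * d / (1 - d)) * (1 - d**n) + rate * n


if __name__ == "__main__":
    args = (1 / 3, 2 / 3, 2.0, 1.0, 0.5)
    print("rate:", lower_bound_rate(*args))
    for n in (10, 100, 1000):
        print(n, log_c_tan(*args, pop0=5, n=n) / n)
```

Since the first summand in (107) stays bounded in n, the normalized logarithm approaches the rate, which is negative for this parameter setup.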

In order to deduce closed-form upper bounds for the case

λ \in] 0, 1 [

, we first recall from Sections 3.6 to 3.13 that we have to employ suitable parameters

p_{λ}^{U} = p^{U} (β_{A}, β_{H}, α_{A}, α_{H}, λ), q_{λ}^{U} = q^{U} (β_{A}, β_{H}, α_{A}, α_{H}, λ)

satisfying (35). Notice that we automatically obtain

p_{λ}^{U} \geq p_{λ}^{L} = α_{A}^{λ} α_{H}^{1 - λ} > 0

. Correspondingly, we obtain

Theorem 7.

For all

(β_{A}, β_{H}, α_{A}, α_{H}, λ) \in (P_{SP} \ P_{SP, 1}) \times] 0, 1 [

, all coefficients

p_{λ}^{U}, q_{λ}^{U}

which satisfy (35) for all

x \in N_{0}

and additionally either

0 < q_{λ}^{U} \leq β_{λ}

or

β_{λ} < q_{λ}^{U} < min {1, e^{β_{λ} - 1}}

, all initial population sizes

X_{0} \in N

and all observation horizons

n \in N

the following assertions hold true:

C_{λ, X_{0}, n}^{(p_{λ}^{U}, q_{λ}^{U}), S} \geq C_{λ, X_{0}, n}^{(p_{λ}^{U}, q_{λ}^{U}), U} \geq {\tilde{B}}_{λ, X_{0}, n}^{(p_{λ}^{U}, q_{λ}^{U})} \geq B_{λ, X_{0}, n}^{U}, where

(109)

(a): in the case $0 < q_{λ}^{U} < β_{λ}$ one has

$\begin{matrix} C_{λ, X_{0}, n}^{(p_{λ}^{U}, q_{λ}^{U}), U} : = C_{λ, X_{0}, n}^{(p_{λ}^{U}, q_{λ}^{U}), S} \cdot exp \{- {\bar{ζ}}_{n}^{(q_{λ}^{U})} \cdot X_{0} - \frac{p_{λ}^{U}}{q_{λ}^{U}} \cdot {\bar{ϑ}}_{n}^{(q_{λ}^{U})}\} \end{matrix}$

(110)

$\begin{matrix} with C_{λ, X_{0}, n}^{(p_{λ}^{U}, q_{λ}^{U}), S} : = exp {x_{0}^{(q_{λ}^{U})} \cdot [X_{0} - \frac{p_{λ}^{U}}{q_{λ}^{U}} \cdot \frac{d^{(q_{λ}^{U}), S}}{1 - d^{(q_{λ}^{U}), S}}] \cdot (1 - {(d^{(q_{λ}^{U}), S})}^{n}) \\ + (\frac{p_{λ}^{U}}{q_{λ}^{U}} \cdot (β_{λ} + x_{0}^{(q_{λ}^{U})}) - α_{λ}) \cdot n}, \\ {\bar{ζ}}_{n}^{(q_{λ}^{U})} : = Γ_{<}^{(q_{λ}^{U})} \cdot [\frac{{(d^{(q_{λ}^{U}), S})}^{n} - {(d^{(q_{λ}^{U}), T})}^{n}}{d^{(q_{λ}^{U}), S} - d^{(q_{λ}^{U}), T}} - {(d^{(q_{λ}^{U}), S})}^{n - 1} \cdot \frac{1 - {(d^{(q_{λ}^{U}), T})}^{n}}{1 - d^{(q_{λ}^{U}), T}}] > 0, \end{matrix}$

(111)

$\begin{matrix} {\bar{ϑ}}_{n}^{(q_{λ}^{U})} : = Γ_{<}^{(q_{λ}^{U})} \cdot \frac{d^{(q_{λ}^{U}), T}}{1 - d^{(q_{λ}^{U}), T}} \cdot [\frac{1 - {(d^{(q_{λ}^{U}), S} d^{(q_{λ}^{U}), T})}^{n}}{1 - d^{(q_{λ}^{U}), S} d^{(q_{λ}^{U}), T}} - \frac{{(d^{(q_{λ}^{U}), S})}^{n} - {(d^{(q_{λ}^{U}), T})}^{n}}{d^{(q_{λ}^{U}), S} - d^{(q_{λ}^{U}), T}}] > 0; \end{matrix}$

(112)

furthermore, whenever $p_{λ}^{U}, q_{λ}^{U}$ satisfy additionally (47) (such parameters exist particularly in the setups $P_{SP, 2} \cup P_{SP, 3 a} \cup P_{SP, 3 b}$ , cf. Section 3.7, Section 3.8 and Section 3.9), then

$1 > C_{λ, X_{0}, n}^{(p_{λ}^{U}, q_{λ}^{U}), S} and {\tilde{B}}_{λ, X_{0}, n}^{(p_{λ}^{U}, q_{λ}^{U})} = B_{λ, X_{0}, n}^{U} \forall n \in N;$
(b): in the case $0 < q_{λ}^{U} = β_{λ}$ one has

$C_{λ, X_{0}, n}^{(p_{λ}^{U}, q_{λ}^{U}), U} : = C_{λ, X_{0}, n}^{(p_{λ}^{U}, q_{λ}^{U}), S} : = {\tilde{B}}_{λ, X_{0}, n}^{(p_{λ}^{U}, q_{λ}^{U})} = exp \{(p_{λ}^{U} - α_{λ}) \cdot n\};$
(c): in the case $β_{λ} < q_{λ}^{U} < min \{1, e^{β_{λ} - 1}\}$ the formulas (109) and (110) remain valid, but with

$\begin{matrix} {\bar{ζ}}_{n}^{(q_{λ}^{U})} & : = & Γ_{>}^{(q_{λ}^{U})} \cdot {(d^{(q_{λ}^{U}), S})}^{n - 1} \cdot [n - \frac{1 - {(d^{(q_{λ}^{U}), T})}^{n}}{1 - d^{(q_{λ}^{U}), T}}] > 0, \end{matrix}$

(113)

$\begin{matrix} {\bar{ϑ}}_{n}^{(q_{λ}^{U})} & : = & Γ_{>}^{(q_{λ}^{U})} \cdot [\frac{d^{(q_{λ}^{U}), S} - d^{(q_{λ}^{U}), T}}{{(1 - d^{(q_{λ}^{U}), S})}^{2} (1 - d^{(q_{λ}^{U}), T})} \cdot (1 - {(d^{(q_{λ}^{U}), S})}^{n}) \\ + \frac{d^{(q_{λ}^{U}), T} (1 - {(d^{(q_{λ}^{U}), S} d^{(q_{λ}^{U}), T})}^{n})}{(1 - d^{(q_{λ}^{U}), T}) (1 - d^{(q_{λ}^{U}), S} d^{(q_{λ}^{U}), T})} - \frac{{(d^{(q_{λ}^{U}), S})}^{n}}{1 - d^{(q_{λ}^{U}), S}} \cdot n] > 0; \end{matrix}$

(114)
(d): for all cases (a) to (c) one gets

$\begin{matrix} lim_{n \to \infty} \frac{1}{n} log (C_{λ, X_{0}, n}^{(p_{λ}^{U}, q_{λ}^{U}), S}) & = & lim_{n \to \infty} \frac{1}{n} log (C_{λ, X_{0}, n}^{(p_{λ}^{U}, q_{λ}^{U}), U}) = lim_{n \to \infty} \frac{1}{n} log ({\tilde{B}}_{λ, X_{0}, n}^{(p_{λ}^{U}, q_{λ}^{U})}) \\ = & \frac{p_{λ}^{U}}{q_{λ}^{U}} \cdot (β_{λ} + x_{0}^{(q_{λ}^{U})}) - α_{λ}, \end{matrix}$

where in the case $q_{λ}^{U} = β_{λ}$ there holds $x_{0}^{(q_{λ}^{U})} = 0$ .

Theorem 7 will be proved in Appendix A.3. Notice that for an inadequate choice of

p_{λ}^{U}, q_{λ}^{U}

it may hold that

\frac{p_{λ}^{U}}{q_{λ}^{U}} (β_{λ} + x_{0}^{(q_{λ}^{U})}) - α_{λ} > 0

in part (d) of Theorem 7.

6.4. Explicit Closed-Form Bounds for the Cases $(β_{A}, β_{H}, α_{A}, α_{H}, λ) \in (P_{SP} \ P_{SP, 1}) \times (R \ [0, 1])$

For

λ \in R \ [0, 1]

, let us now construct closed-form lower bounds of the recursive lower bound components

{\tilde{B}}_{λ, X_{0}, n}^{(p_{λ}^{L}, q_{λ}^{L})}

, for suitable parameters

p_{λ}^{L} \geq 0

and either

0 < q_{λ}^{L} \leq β_{λ}

or

max {0, β_{λ}} < q_{λ}^{L} < min {1, e^{β_{λ} - 1}}

satisfying (35).

Theorem 8.

For all

(β_{A}, β_{H}, α_{A}, α_{H}, λ) \in (P_{SP} \ P_{SP, 1}) \times (R \ [0, 1])

, all coefficients

p_{λ}^{L} \geq 0, q_{λ}^{L} > 0

which satisfy (35) for all

x \in N_{0}

and either

0 < q_{λ}^{L} \leq β_{λ}

or

max {0, β_{λ}} < q_{λ}^{L} < min {1, e^{β_{λ} - 1}}

, all initial population sizes

X_{0} \in N

and all observation horizons

n \in N

the following assertions hold true:

C_{λ, X_{0}, n}^{(p_{λ}^{L}, q_{λ}^{L}), T} \leq C_{λ, X_{0}, n}^{(p_{λ}^{L}, q_{λ}^{L}), L} \leq {\tilde{B}}_{λ, X_{0}, n}^{(p_{λ}^{L}, q_{λ}^{L})} \leq B_{λ, X_{0}, n}^{L}, where

(115)

(a): in the case $0 < q_{λ}^{L} < β_{λ}$ one has

$\begin{matrix} C_{λ, X_{0}, n}^{(p_{λ}^{L}, q_{λ}^{L}), L} : = C_{λ, X_{0}, n}^{(p_{λ}^{L}, q_{λ}^{L}), T} \cdot exp \{{\underset{̲}{ζ}}_{n}^{(q_{λ}^{L})} \cdot X_{0} + \frac{p_{λ}^{L}}{q_{λ}^{L}} \cdot {\underset{̲}{ϑ}}_{n}^{(q_{λ}^{L})}\}, \end{matrix}$

(116)

$\begin{matrix} with C_{λ, X_{0}, n}^{(p_{λ}^{L}, q_{λ}^{L}), T} : = exp {x_{0}^{(q_{λ}^{L})} \cdot [X_{0} - \frac{p_{λ}^{L}}{q_{λ}^{L}} \cdot \frac{d^{(q_{λ}^{L}), T}}{1 - d^{(q_{λ}^{L}), T}}] \cdot (1 - {(d^{(q_{λ}^{L}), T})}^{n}) \\ + (\frac{p_{λ}^{L}}{q_{λ}^{L}} \cdot (β_{λ} + x_{0}^{(q_{λ}^{L})}) - α_{λ}) \cdot n} \\ {\underset{̲}{ζ}}_{n}^{(q_{λ}^{L})} : = Γ_{<}^{(q_{λ}^{L})} \cdot \frac{{(d^{(q_{λ}^{L}), T})}^{n - 1}}{1 - d^{(q_{λ}^{L}), T}} \cdot (1 - {(d^{(q_{λ}^{L}), T})}^{n}) > 0, \end{matrix}$

(117)

$\begin{matrix} {\underset{̲}{ϑ}}_{n}^{(q_{λ}^{L})} : = Γ_{<}^{(q_{λ}^{L})} \cdot \frac{1 - {(d^{(q_{λ}^{L}), T})}^{n}}{{(1 - d^{(q_{λ}^{L}), T})}^{2}} \cdot [1 - \frac{d^{(q_{λ}^{L}), T} (1 + {(d^{(q_{λ}^{L}), T})}^{n})}{1 + d^{(q_{λ}^{L}), T}}] > 0; \end{matrix}$

(118)

furthermore, whenever $p_{λ}^{L}, q_{λ}^{L}$ satisfy additionally (56) (such parameters exist particularly in the setups $P_{SP, 2} \cup P_{SP, 3 a} \cup P_{SP, 3 b}$ , cf. Section 3.17, Section 3.18 and Section 3.19), then

$1 < C_{λ, X_{0}, n}^{(p_{λ}^{L}, q_{λ}^{L}), T} \quad \text{and} \quad {\tilde{B}}_{λ, X_{0}, n}^{(p_{λ}^{L}, q_{λ}^{L})} = B_{λ, X_{0}, n}^{L} \quad \forall n \in N;$
(b): in the case $0 < q_{λ}^{L} = β_{λ}$ one has

$C_{λ, X_{0}, n}^{(p_{λ}^{L}, q_{λ}^{L}), L} : = C_{λ, X_{0}, n}^{(p_{λ}^{L}, q_{λ}^{L}), T} = {\tilde{B}}_{λ, X_{0}, n}^{(p_{λ}^{L}, q_{λ}^{L})} = exp \{(p_{λ}^{L} - α_{λ}) \cdot n\};$
(c): in the case $max {0, β_{λ}} < q_{λ}^{L} < min \{1, e^{β_{λ} - 1}\}$ the formulas (115) and (116) remain valid, but with

${\underset{̲}{ζ}}_{n}^{(q_{λ}^{L})} : = Γ_{>}^{(q_{λ}^{L})} \cdot \frac{{(d^{(q_{λ}^{L}), T})}^{n} - {(d^{(q_{λ}^{L}), S})}^{2 n}}{d^{(q_{λ}^{L}), T} - {(d^{(q_{λ}^{L}), S})}^{2}} > 0,$

(119)

${\underset{̲}{ϑ}}_{n}^{(q_{λ}^{L})} : = \frac{Γ_{>}^{(q_{λ}^{L})}}{d^{(q_{λ}^{L}), T} - {(d^{(q_{λ}^{L}), S})}^{2}} \cdot [\frac{d^{(q_{λ}^{L}), T} \cdot (1 - {(d^{(q_{λ}^{L}), T})}^{n})}{1 - d^{(q_{λ}^{L}), T}} - \frac{{(d^{(q_{λ}^{L}), S})}^{2} \cdot (1 - {(d^{(q_{λ}^{L}), S})}^{2 n})}{1 - {(d^{(q_{λ}^{L}), S})}^{2}}] > 0;$

(120)
(d): for all cases (a) to (c) one gets

$\begin{matrix} lim_{n \to \infty} \frac{1}{n} log (C_{λ, X_{0}, n}^{(p_{λ}^{L}, q_{λ}^{L}), T}) & = & lim_{n \to \infty} \frac{1}{n} log (C_{λ, X_{0}, n}^{(p_{λ}^{L}, q_{λ}^{L}), L}) = lim_{n \to \infty} \frac{1}{n} log ({\tilde{B}}_{λ, X_{0}, n}^{(p_{λ}^{L}, q_{λ}^{L})}) \\ = & \frac{p_{λ}^{L}}{q_{λ}^{L}} \cdot (β_{λ} + x_{0}^{(q_{λ}^{L})}) - α_{λ}, \end{matrix}$

where in the case $q_{λ}^{L} = β_{λ}$ there holds $x_{0}^{(q_{λ}^{L})} = 0$ .

For the proof of Theorem 8, see Appendix A.3. Notice that for an inadequate choice of

p_{λ}^{L}, q_{λ}^{L}

it may hold that

\frac{p_{λ}^{L}}{q_{λ}^{L}} (β_{λ} + x_{0}^{(q_{λ}^{L})}) - α_{λ} < 0

in the last assertion of Theorem 8.
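To make the ordering in (115) concrete for part (a) of Theorem 8, the following minimal Python sketch evaluates the closed-form bounds (116) to (118) and compares them with a recursively computed bound. Several ingredients are assumptions not stated in this section: the recursion is taken to be a_n = q e^{a_{n-1}} - β_λ with a_0 = 0, the recursive bound is reconstructed as exp{a_n X_0 + (p/q) Σ a_k + n ((p/q) β_λ - α_λ)} from the shape of (117), and the numerical parameter values are purely hypothetical (they are not checked against condition (35)). The quantities Γ_< and d^T follow the expressions (q/2) e^{x_0} x_0^2 and q e^{x_0} used elsewhere in the paper.

```python
import math


def fixed_point(q, beta, tol=1e-14, max_iter=100_000):
    """Approximate the fixed point x_0 of x -> q*exp(x) - beta, iterating from 0."""
    x = 0.0
    for _ in range(max_iter):
        x_new = q * math.exp(x) - beta
        if abs(x_new - x) < tol:
            return x_new
        x = x_new
    raise RuntimeError("fixed-point iteration did not converge")


def recursive_bound(p, q, beta, alpha, x0_pop, n):
    """Recursive bound, reconstructed under the assumptions named in the lead-in."""
    a, total = 0.0, 0.0
    for _ in range(n):
        a = q * math.exp(a) - beta      # assumed recursion with a_0 = 0
        total += a
    return math.exp(a * x0_pop + (p / q) * total + n * ((p / q) * beta - alpha))


def closed_form_lower_bounds(p, q, beta, alpha, x0_pop, n):
    """Return (C^T, C^L) from (116)-(118); intended for the case 0 < q < beta."""
    x0 = fixed_point(q, beta)
    d = q * math.exp(x0)                        # d^{(q),T}
    gamma = 0.5 * q * math.exp(x0) * x0 ** 2    # Gamma_<^{(q)}
    r = p / q
    c_t = math.exp(x0 * (x0_pop - r * d / (1 - d)) * (1 - d ** n)
                   + (r * (beta + x0) - alpha) * n)                 # (117)
    zeta = gamma * d ** (n - 1) / (1 - d) * (1 - d ** n)            # (117)
    theta = (gamma * (1 - d ** n) / (1 - d) ** 2
             * (1 - d * (1 + d ** n) / (1 + d)))                    # (118)
    c_l = c_t * math.exp(zeta * x0_pop + r * theta)                 # (116)
    return c_t, c_l


# Hypothetical parameters with 0 < q < beta_lambda.
c_t, c_l = closed_form_lower_bounds(p=1.0, q=0.5, beta=0.8, alpha=1.2, x0_pop=5, n=10)
b_rec = recursive_bound(p=1.0, q=0.5, beta=0.8, alpha=1.2, x0_pop=5, n=10)
print(c_t, c_l, b_rec)
```

For these hypothetical inputs the three printed numbers should appear in increasing order, mirroring the first two inequalities of (115).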

To derive closed-form upper bounds of the recursive upper bounds

B_{λ, X_{0}, n}^{U}

of the Hellinger integral in the case

λ \in R \ [0, 1]

, let us first recall from Section 3.24 that we have to use the parameters

p_{λ}^{U} = α_{A}^{λ} α_{H}^{1 - λ} > 0

and

q_{λ}^{U} = β_{A}^{λ} β_{H}^{1 - λ} > 0

. Furthermore, in the case

β_{A} \neq β_{H}

we obtain from Lemma 1 (setting

q_{λ} = q_{λ}^{U}

) the assertion that

max {0, β_{λ}} < q_{λ}^{U} < min {1, e^{β_{λ} - 1}}

iff

λ \in] λ_{-}, λ_{+} [\ [0, 1]

(implying that the sequence

{(a_{n}^{(q_{λ}^{U})})}_{n \in N}

converges). In the case

β_{A} = β_{H}

one gets

q_{λ}^{U} = β_{A}^{λ} β_{H}^{1 - λ} = β_{A} = β_{H} = β_{λ}

and therefore (cf. (P2))

a_{n}^{(q_{λ}^{U})} = 0

for all

n \in N

and for all

λ \in R \ [0, 1]

. Correspondingly, we deduce

Theorem 9.

Let

p_{λ}^{U} = α_{A}^{λ} α_{H}^{1 - λ}

and

q_{λ}^{U} = β_{A}^{λ} β_{H}^{1 - λ}

. Then, the following assertions hold true:

(a): For all $(β_{A}, β_{H}, α_{A}, α_{H}, λ) \in (P_{SP, 2} \cup P_{SP, 3 a} \cup P_{SP, 3 b} \cup P_{SP, 3 c}) \times (] λ_{-}, λ_{+} [\ [0, 1])$ (in particular for $β_{A} \neq β_{H}$ ), all initial population sizes $X_{0} \in N$ and all observation horizons $n \in N$ there holds

$\infty > C_{λ, X_{0}, n}^{(p_{λ}^{U}, q_{λ}^{U}), S} \geq C_{λ, X_{0}, n}^{(p_{λ}^{U}, q_{λ}^{U}), U} \geq B_{λ, X_{0}, n}^{U} > 1,$

$\begin{matrix} where & C_{λ, X_{0}, n}^{(p_{λ}^{U}, q_{λ}^{U}), U} : = C_{λ, X_{0}, n}^{(p_{λ}^{U}, q_{λ}^{U}), S} \cdot exp \{- {\bar{ζ}}_{n}^{(q_{λ}^{U})} \cdot X_{0} - \frac{p_{λ}^{U}}{q_{λ}^{U}} \cdot {\bar{ϑ}}_{n}^{(q_{λ}^{U})}\} \end{matrix}$

(121)

$\begin{matrix} with & C_{λ, X_{0}, n}^{(p_{λ}^{U}, q_{λ}^{U}), S} : = exp {x_{0}^{(q_{λ}^{U})} \cdot [X_{0} - \frac{p_{λ}^{U}}{q_{λ}^{U}} \cdot \frac{d^{(q_{λ}^{U}), S}}{1 - d^{(q_{λ}^{U}), S}}] \cdot (1 - {(d^{(q_{λ}^{U}), S})}^{n}) \\ + (\frac{p_{λ}^{U}}{q_{λ}^{U}} \cdot (β_{λ} + x_{0}^{(q_{λ}^{U})}) - α_{λ}) \cdot n}, \\ {\bar{ζ}}_{n}^{(q_{λ}^{U})} : = Γ_{>}^{(q_{λ}^{U})} \cdot {(d^{(q_{λ}^{U}), S})}^{n - 1} \cdot [n - \frac{1 - {(d^{(q_{λ}^{U}), T})}^{n}}{1 - d^{(q_{λ}^{U}), T}}] > 0, \end{matrix}$

(122)

$\begin{matrix} {\bar{ϑ}}_{n}^{(q_{λ}^{U})} : = Γ_{>}^{(q_{λ}^{U})} \cdot [\frac{d^{(q_{λ}^{U}), S} - d^{(q_{λ}^{U}), T}}{{(1 - d^{(q_{λ}^{U}), S})}^{2} (1 - d^{(q_{λ}^{U}), T})} \cdot (1 - {(d^{(q_{λ}^{U}), S})}^{n}) \\ + \frac{d^{(q_{λ}^{U}), T} (1 - {(d^{(q_{λ}^{U}), S} d^{(q_{λ}^{U}), T})}^{n})}{(1 - d^{(q_{λ}^{U}), T}) (1 - d^{(q_{λ}^{U}), S} d^{(q_{λ}^{U}), T})} - \frac{{(d^{(q_{λ}^{U}), S})}^{n}}{1 - d^{(q_{λ}^{U}), S}} \cdot n] > 0 . \end{matrix}$

(123)
(b): For all $(β_{A}, β_{H}, α_{A}, α_{H}, λ) \in (P_{SP, 4 a} \cup P_{SP, 4 b}) \times (R \ [0, 1])$ (for which particularly $0 < q_{λ}^{U} = β_{λ}$ , $β_{A} = β_{H}$ ), all initial population sizes $X_{0} \in N$ and all observation horizons $n \in N$ there holds

$C_{λ, X_{0}, n}^{(p_{λ}^{U}, q_{λ}^{U}), U} : = C_{λ, X_{0}, n}^{(p_{λ}^{U}, q_{λ}^{U}), S} : = B_{λ, X_{0}, n}^{U} = exp \{(p_{λ}^{U} - α_{λ}) \cdot n\} > 1 .$
(c): For all $(β_{A}, β_{H}, α_{A}, α_{H}, λ) \in (P_{SP} \ P_{SP, 1}) \times (] λ_{-}, λ_{+} [\ [0, 1])$ and all initial population sizes $X_{0} \in N$ one gets

$\begin{matrix} lim_{n \to \infty} \frac{1}{n} log (C_{λ, X_{0}, n}^{(p_{λ}^{U}, q_{λ}^{U}), S}) & = & lim_{n \to \infty} \frac{1}{n} log (C_{λ, X_{0}, n}^{(p_{λ}^{U}, q_{λ}^{U}), U}) = lim_{n \to \infty} \frac{1}{n} log (B_{λ, X_{0}, n}^{U}) \\ = & \frac{p_{λ}^{U}}{q_{λ}^{U}} \cdot (β_{λ} + x_{0}^{(q_{λ}^{U})}) - α_{λ} > 0, \end{matrix}$

where in the case $β_{A} = β_{H}$ there holds $q_{λ}^{U} = β_{λ}$ and $x_{0}^{(q_{λ}^{U})} = 0$ .

A proof of Theorem 9 is provided in Appendix A.3.
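As a small numerical illustration of part (b) of Theorem 9, the following Python sketch evaluates exp{(p_λ^U - α_λ) n} with p_λ^U = α_A^λ α_H^{1-λ} and α_λ = λ α_A + (1 - λ) α_H. The function name and the parameter values are hypothetical. The bound exceeds 1 because, for λ outside [0, 1] and α_A ≠ α_H, the weighted geometric mean α_A^λ α_H^{1-λ} is strictly larger than the corresponding weighted arithmetic mean (the reverse of the usual inequality for λ in ]0, 1[).

```python
import math


def theorem9b_bound(alpha_a, alpha_h, lam, n):
    """Evaluate exp{(p - alpha_lambda) * n} for lambda outside [0, 1]."""
    if 0.0 <= lam <= 1.0:
        raise ValueError("part (b) is stated for lambda outside [0, 1]")
    p = alpha_a ** lam * alpha_h ** (1.0 - lam)          # p_lambda^U
    alpha_lam = lam * alpha_a + (1.0 - lam) * alpha_h    # alpha_lambda
    return math.exp((p - alpha_lam) * n)


# Hypothetical immigration parameters alpha_A = 2, alpha_H = 1.
for lam in (-1.0, 2.0):
    print(lam, theorem9b_bound(2.0, 1.0, lam, n=3))
```

In this case the growth rate (1/n) log of the bound equals p_λ^U - α_λ for every n, in line with part (c) and x_0 = 0.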

Remark 7.

Substituting

a_{n}^{(q)}

by

a_{n}^{(q), T}

resp.

a_{n}^{(q), S}

(cf. (78) resp. (79)) in

{\tilde{B}}_{λ, X_{0}, n}^{(p, q)}

from (42) leads to the “rudimentary” closed-form bounds

C_{λ, X_{0}, n}^{(p, q), T}

resp.

C_{λ, X_{0}, n}^{(p, q), S}

, whereas substituting

a_{n}^{(q)}

by

{\underset{̲}{a}}_{n}^{(q)}

resp.

{\bar{a}}_{n}^{(q)}

(cf. (92) resp. (94)) in

{\tilde{B}}_{λ, X_{0}, n}^{(p, q)}

from (42) leads to the “improved” closed-form bounds

C_{λ, X_{0}, n}^{(p, q), L}

resp.

C_{λ, X_{0}, n}^{(p, q), U}

in all the Theorems 5–9.

6.5. Totally Explicit Closed-Form Bounds

The above-mentioned results give closed-form lower bounds

C_{λ, X_{0}, n}^{(p, q), L}

,

C_{λ, X_{0}, n}^{(p, q), T}

resp. closed-form upper bounds

C_{λ, X_{0}, n}^{(p, q), U}

,

C_{λ, X_{0}, n}^{(p, q), S}

of the Hellinger integrals

H_{λ} (P_{A, n} ∥ P_{H, n})

for case-dependent choices of

p, q

. However, these bounds still involve the fixed point

x_{0}^{(q)}

which in general has to be calculated implicitly. In order to get “totally” explicit but “slightly” less tight closed-form bounds of

H_{λ} (P_{A, n} ∥ P_{H, n})

, one can proceed as follows:

in all the closed-form lower bound formulas of the Theorems 5, 6 and 8–including the definitions (76), (77) and (91)–replace the implicit $x_{0}^{(q)}$ by a close explicitly known point ${\underset{̲}{x}}_{0}^{(q)} < x_{0}^{(q)}$ ;
in all closed-form upper bound formulas of the Theorems 5, 7 and 9–including (76), (77) and (91)–replace $x_{0}^{(q)}$ by a close explicitly known point ${\bar{x}}_{0}^{(q)} > x_{0}^{(q)}$ .

For instance, one can use the following choices which will be also employed as an auxiliary tool for the diffusion-limit-concerning proof of Lemma A6 in Appendix A.4:

\begin{matrix} {\underset{̲}{x}}_{0}^{(q)} & : = & \{\begin{matrix} q^{- 1} \cdot e^{- {\underset{̲}{\underset{̲}{x}}}_{0}^{(q)}} \cdot [(1 - q) - \sqrt{{(1 - q)}^{2} - 2 \cdot q \cdot e^{{\underset{̲}{\underset{̲}{x}}}_{0}^{(q)}} \cdot (q - β_{λ})}], & if q \in] 0, β_{λ} [, \\ q^{- 1} \cdot [(1 - q) - \sqrt{{(1 - q)}^{2} - 2 \cdot q \cdot (q - β_{λ})}], & if max {0, β_{λ}} < q < min {1, e^{β_{λ} - 1}}, \end{matrix} \end{matrix}

(124)

\begin{matrix} where {\underset{̲}{\underset{̲}{x}}}_{0}^{(q)} : = \{\begin{matrix} max \{- β_{λ}, \frac{q - β_{λ}}{1 - q}\}, & if q \in] 0, 1 [, \\ - β_{λ}, & if q \geq 1, \end{matrix} \end{matrix}

(125)

{\bar{x}}_{0}^{(q)} := \begin{cases} q^{- 1} \cdot [(1 - q) - \sqrt{{(1 - q)}^{2} - 2 \cdot q \cdot (q - β_{λ})}], & \text{if } q \in ] 0, β_{λ} [, \\ (1 - q) - \sqrt{{(1 - q)}^{2} - 2 \cdot (q - β_{λ})}, & \text{if } \max \{0, β_{λ}\} < q < \min \{1, e^{β_{λ} - 1}\} \text{ and } {(1 - q)}^{2} - 2 \cdot q \cdot (q - β_{λ}) \geq 0, \\ {\bar{\bar{x}}}_{0}^{(q)} := - \log (q), & \text{if } \max \{0, β_{λ}\} < q < \min \{1, e^{β_{λ} - 1}\} \text{ and } {(1 - q)}^{2} - 2 \cdot q \cdot (q - β_{λ}) < 0 . \end{cases}

(126)

Behind this choice “lies” the idea that–in contrast to the solution

x_{0}^{(q)}

of

ξ_{λ}^{(q)} (x) : = q e^{x} - β_{λ} = x

–the point

{\underset{̲}{x}}_{0}^{(q)}

is a solution of (the obviously explicitly solvable)

{\underset{̲}{Q}}_{λ}^{(q)} (x) : = {\underset{̲}{a}}_{λ}^{(q)} x^{2} + {\underset{̲}{b}}_{λ}^{(q)} x + {\underset{̲}{c}}_{λ}^{(q)} = x

in both cases

0 < q < β_{λ}

and

max {0, β_{λ}} < q < min {1, e^{β_{λ} - 1}}

, whereas the point

{\bar{x}}_{0}^{(q)}

is a solution of

{\bar{Q}}_{λ}^{(q)} (x) : = {\bar{a}}_{λ}^{(q)} x^{2} + {\bar{b}}_{λ}^{(q)} x + {\bar{c}}_{λ}^{(q)} = x

in the case

0 < q < β_{λ}

and in the case

max {0, β_{λ}} < q < min {1, e^{β_{λ} - 1}}

together with

{(1 - q)}^{2} - 2 \cdot q \cdot (q - β_{λ}) \geq 0

. Thereby,

{\underset{̲}{Q}}_{λ}^{(q)} (\cdot)

and

{\bar{Q}}_{λ}^{(q)} (\cdot)

are the lower resp. upper quadratic approximates of

ξ_{λ}^{(q)} (\cdot)

satisfying the following constraints:

for $q \in] 0, β_{λ} [$ (mostly but not only for $λ \in] 0, 1 [$ ) (lower bound):

${\underset{̲}{Q}}_{λ}^{(q)} (0) = ξ_{λ}^{(q)} (0) = q - β_{λ}, {\underset{̲}{Q}}_{λ}^{(q)'} (0) = ξ_{λ}^{(q)'} (0) = q, {\underset{̲}{Q}}_{λ}^{(q)''} (x) = ξ_{λ}^{(q)''} (y) = q e^{y}, x \in R,$

for some explicitly known approximate $y < x_{0}^{(q)}$ (leading to the (tighter) explicit lower approximate ${\underset{̲}{x}}_{0}^{(q)} \in] y, x_{0}^{(q)} [$ ); here, we choose

$y : = {\underset{̲}{\underset{̲}{x}}}_{0}^{(q)} : = \{\begin{matrix} max \{- β_{λ}, \frac{q - β_{λ}}{1 - q}\}, & if q < 1, \\ - β_{λ}, & if q \geq 1; \end{matrix}$
for $q \in] 0, β_{λ} [$ (mostly but not only for $λ \in] 0, 1 [$ ) (upper bound):

${\bar{Q}}_{λ}^{(q)} (0) = ξ_{λ}^{(q)} (0) = q - β_{λ}, {\bar{Q}}_{λ}^{(q)'} (0) = ξ_{λ}^{(q)'} (0) = q, {\bar{Q}}_{λ}^{(q)''} (x) = ξ_{λ}^{(q)''} (0) = q, x \in R;$
for $max {0, β_{λ}} < q < min {1, e^{β_{λ} - 1}}$ (mostly but not only for $λ \in R \ [0, 1]$ ) (lower bound):

${\underset{̲}{Q}}_{λ}^{(q)} (0) = ξ_{λ}^{(q)} (0) = q - β_{λ}, {\underset{̲}{Q}}_{λ}^{(q)'} (0) = ξ_{λ}^{(q)'} (0) = q, {\underset{̲}{Q}}_{λ}^{(q)''} (x) = ξ_{λ}^{(q)''} (0) = q, x \in R;$
for $max {0, β_{λ}} < q < min {1, e^{β_{λ} - 1}}$ in combination with ${(1 - q)}^{2} - 2 \cdot q \cdot (q - β_{λ}) \geq 0$ (mostly but not only for $λ \in R \ [0, 1]$ ) (upper bound):

${\bar{Q}}_{λ}^{(q)} (0) = ξ_{λ}^{(q)} (0) = q - β_{λ}, {\bar{Q}}_{λ}^{(q)'} (0) = ξ_{λ}^{(q)'} (0) = q, {\bar{Q}}_{λ}^{(q)''} (x) = ξ_{λ}^{(q)''} (- log (q)) = 1, x \in R .$

If

max {0, β_{λ}} < q < min {1, e^{β_{λ} - 1}}

and

{(1 - q)}^{2} - 2 \cdot q \cdot (q - β_{λ}) < 0

, then a real-valued solution

{\bar{Q}}_{λ}^{(q)} (x) = x

does not exist and we set

{\bar{x}}_{0}^{(q)} : = {\bar{\bar{x}}}_{0}^{(q)} : = - log (q)

, with

ξ_{λ}^{(q)'} ({\bar{\bar{x}}}_{0}^{(q)}) = 1

. The above considerations lead to corresponding unique choices of constants

{\underset{̲}{a}}_{λ}^{(q)}, {\underset{̲}{b}}_{λ}^{(q)}, {\underset{̲}{c}}_{λ}^{(q)}, {\bar{a}}_{λ}^{(q)}, {\bar{b}}_{λ}^{(q)}, {\bar{c}}_{λ}^{(q)}

culminating in

\begin{matrix} {\underset{̲}{Q}}_{λ}^{(q)} (x) & : = & \{\begin{matrix} \frac{q}{2} \cdot e^{{\underset{̲}{\underset{̲}{x}}}_{0}^{(q)}} \cdot x^{2} + q \cdot x + q - β_{λ}, & if 0 < q < β_{λ}, \\ \frac{q}{2} \cdot x^{2} + q \cdot x + q - β_{λ}, & if max {0, β_{λ}} < q < min {1, e^{β_{λ} - 1}}, \end{matrix} \end{matrix}

(127)

\begin{matrix} {\bar{Q}}_{λ}^{(q)} (x) & : = & \{\begin{matrix} \frac{q}{2} \cdot x^{2} + q \cdot x + q - β_{λ}, & if 0 < q < β_{λ}, \\ \frac{1}{2} \cdot x^{2} + q \cdot x + q - β_{λ}, & if max {0, β_{λ}} < q < min {1, e^{β_{λ} - 1}} . \end{matrix} \end{matrix}

(128)
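The explicit approximations can be checked numerically against the implicit fixed point. The following Python sketch computes the lower point from (124) and (125), the upper point from (126), and a numerical solution of q e^x - β_λ = x obtained by iterating from 0. The function names and the test values of q and β_λ are hypothetical. One point is an assumption on our side: before taking the square root in the second case of the upper point, the sketch tests the discriminant (1 - q)^2 - 2 (q - β_λ) of the equation that results from setting the second line of (128) equal to x, and it falls back to -log(q) if that discriminant is negative.

```python
import math


def xi_fixed_point(q, beta, tol=1e-14, max_iter=100_000):
    """Numerically approximate x_0 solving q*exp(x) - beta = x (iteration from 0)."""
    x = 0.0
    for _ in range(max_iter):
        x_new = q * math.exp(x) - beta
        if abs(x_new - x) < tol:
            return x_new
        x = x_new
    raise RuntimeError("fixed-point iteration did not converge")


def in_second_case(q, beta):
    """True if max{0, beta} < q < min{1, exp(beta - 1)}."""
    return max(0.0, beta) < q < min(1.0, math.exp(beta - 1.0))


def lower_point(q, beta):
    """Explicit lower approximation of x_0, following (124) and (125)."""
    if 0.0 < q < beta:
        y = max(-beta, (q - beta) / (1.0 - q)) if q < 1.0 else -beta   # (125)
        ey = math.exp(y)
        root = math.sqrt((1.0 - q) ** 2 - 2.0 * q * ey * (q - beta))
        return ((1.0 - q) - root) / (q * ey)
    if in_second_case(q, beta):
        root = math.sqrt((1.0 - q) ** 2 - 2.0 * q * (q - beta))
        return ((1.0 - q) - root) / q
    raise ValueError("q lies outside the two cases covered by (124)")


def upper_point(q, beta):
    """Explicit upper approximation of x_0, following (126)."""
    if 0.0 < q < beta:
        root = math.sqrt((1.0 - q) ** 2 - 2.0 * q * (q - beta))
        return ((1.0 - q) - root) / q
    if in_second_case(q, beta):
        # Assumption: test the discriminant of (1/2) x^2 + q x + q - beta = x.
        disc = (1.0 - q) ** 2 - 2.0 * (q - beta)
        if disc >= 0.0:
            return (1.0 - q) - math.sqrt(disc)
        return -math.log(q)          # fallback point with xi'(x) = 1
    raise ValueError("q lies outside the cases covered by (126)")


for q, beta in [(0.5, 0.8), (0.55, 0.5), (0.6, 0.5)]:   # hypothetical values
    print(q, beta, lower_point(q, beta), xi_fixed_point(q, beta), upper_point(q, beta))
```

For each of the three hypothetical parameter pairs, the printed triple should be increasing, i.e., the numerical fixed point is enclosed by the two explicit points.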

6.6. Closed-Form Bounds for Power Divergences of Non-Kullback-Leibler-Information-Divergence Type

Analogously to Section 4 (see especially Section 4.1), for orders

λ \in R \ {0, 1}

all the results of the previous Section 6.1, Section 6.2, Section 6.3, Section 6.4 and Section 6.5 carry correspondingly over from closed-form bounds of the Hellinger integrals

H_{λ} (\cdot ∥ \cdot)

to closed-form bounds of the total variation distance

V (\cdot ∥ \cdot)

, by virtue of the relation (cf. (12))

2 (1 - H_{\frac{1}{2}} (P_{A, n} ∥ P_{H, n})) \leq V (P_{A, n} ∥ P_{H, n}) \leq 2 \sqrt{1 - {(H_{\frac{1}{2}} (P_{A, n} ∥ P_{H, n}))}^{2}},

to closed-form bounds of the Renyi divergences

R_{λ} (\cdot ∥ \cdot)

, by virtue of the relation (cf. (7))

0 \leq R_{λ} (P_{A, n} ∥ P_{H, n}) = \frac{1}{λ (λ - 1)} \log H_{λ} (P_{A, n} ∥ P_{H, n}), \quad \text{with } \log 0 := - \infty,

as well as to closed-form bounds of the power divergences

I_{λ} (\cdot ∥ \cdot)

, by virtue of the relation (cf. (2))

I_{λ} (P_{A, n} ∥ P_{H, n}) = \frac{1 - H_{λ} (P_{A, n} ∥ P_{H, n})}{λ \cdot (1 - λ)}, n \in N .

For the sake of brevity, the–merely repetitive–exact details are omitted.
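Although the details are repetitive, the three transformations above are easy to mechanize. The following Python sketch maps a given enclosure [H_lower, H_upper] of a Hellinger integral to enclosures of the total variation distance, the Renyi divergence and the power divergence, using only the relations displayed above. All function names and numerical inputs are hypothetical. Since the sign of λ (λ - 1) determines whether a transformation is increasing or decreasing in H_λ, the helper simply evaluates both endpoints and sorts them.

```python
import math


def total_variation_bounds(h_half_lower, h_half_upper):
    """Enclosure of V from an enclosure of the Hellinger integral of order 1/2."""
    return 2.0 * (1.0 - h_half_upper), 2.0 * math.sqrt(1.0 - h_half_lower ** 2)


def renyi_divergence(h, lam):
    """R_lambda = log(H_lambda) / (lambda * (lambda - 1)), with log 0 := -infinity."""
    log_h = math.log(h) if h > 0.0 else -math.inf
    return log_h / (lam * (lam - 1.0))


def power_divergence(h, lam):
    """I_lambda = (1 - H_lambda) / (lambda * (1 - lambda))."""
    return (1.0 - h) / (lam * (1.0 - lam))


def divergence_interval(h_lower, h_upper, lam, transform):
    """Map an enclosure of H_lambda to an enclosure of transform(H_lambda, lambda)."""
    values = (transform(h_lower, lam), transform(h_upper, lam))
    return min(values), max(values)


# Hypothetical enclosure 0.7 <= H_{1/2} <= 0.8.
print(total_variation_bounds(0.7, 0.8))
print(divergence_interval(0.7, 0.8, 0.5, renyi_divergence))
print(divergence_interval(0.7, 0.8, 0.5, power_divergence))
```

Both divergence transformations are monotone in H_λ for fixed λ, so evaluating the two endpoints suffices.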

6.7. Applications to Decision Making

The investigations of Section 6.1 to Section 6.6 can be applied to the context of Section 2.5 on dichotomous decision making on the space of all possible path scenarios (path space) of Poissonian Galton-Watson processes without (with) immigration GW(I) (e.g., in combination with our running-example epidemiological context of Section 2.3). In more detail, for the minimal mean decision loss (Bayes risk)

R_{n}

defined by (18) we can derive explicit closed-form upper (respectively lower) bounds by using (19) respectively (20) together with the results of the Section 6.1, Section 6.2, Section 6.3, Section 6.4 and Section 6.5 concerning Hellinger integrals of order

λ \in] 0, 1 [

; we can proceed analogously in the Neyman-Pearson context in order to deduce closed-form bounds of type II error probabilities, by means of (23) and (24). Moreover, in an analogous way we can employ the investigations of Section 6.6 on power divergences in order to obtain closed-form bounds of (i) the corresponding (cf. (21)) weighted-average decision risk reduction (weighted-average statistical information measure) about the degree of evidence

d e g

concerning the parameter

θ

that can be attained by observing the GW(I)-path

X_{n}

until stage n, as well as (ii) the corresponding (cf. (22)) limit decision risk reduction (limit statistical information measure). For the sake of brevity, the–merely repetitive–exact details are omitted.

7. Hellinger Integrals and Power Divergences of Galton-Watson Type Diffusion Approximations

7.1. Branching-Type Diffusion Approximations

One can show that a properly rescaled Galton-Watson process without (respectively with) immigration GW(I) converges weakly to a diffusion process

\tilde{X} : = \{{\tilde{X}}_{s}, s \in [0, \infty [\}

which is the unique, strong, nonnegative – and in case of

\frac{η}{σ^{2}} \geq \frac{1}{2}

strictly positive– solution of the stochastic differential equation (SDE) of the form

d {\tilde{X}}_{s} = (η - κ {\tilde{X}}_{s}) d s + σ \sqrt{{\tilde{X}}_{s}} d W_{s}, \quad s \in [0, \infty [, \quad {\tilde{X}}_{0} \in ] 0, \infty [ \ \text{given},

(129)

where

η \in [0, \infty [

,

κ \in [0, \infty [

,

σ \in] 0, \infty [

are constants and

\{W_{s}, s \in [0, \infty [\}

denotes a standard Brownian motion with respect to the underlying probability measure P; see e.g., Feller [130], Jirina [131], Lamperti [132,133], Lindvall [134,135], Grimvall [136], Jagers [56], Borovkov [137], Ethier & Kurtz [138], Durrett [139] for the non-immigration case corresponding to

η = 0

,

κ \geq 0

, Kawazu & Watanabe [140], Wei & Winnicki [141], Winnicki [64] for the immigration case corresponding to

η \neq 0

,

κ = 0

, as well as Sriram [142] for the general case

η \in [0, \infty [

,

κ \in R

. Feller-type branching processes of the form (129), which are special cases of continuous state branching processes with immigration (see e.g., Kawazu & Watanabe [140], Li [143], as well as Dawson & Li [144] for imbeddings to affine processes), play for instance an important role in the modelling of the term structure of interest rates, cf. the seminal Cox-Ingersoll-Ross CIR model [145] and the vast follow-up literature thereof. Furthermore, (129) is also prominently used as (a special case of) Cox & Ross’s [146] constant elasticity of variance CEV asset price process, as (part of) Heston’s [147] stochastic asset-volatility framework, as a model of neuron activity (see e.g., Lansky & Lanska [148], Giorno et al. [149], Lanska et al. [150], Lansky et al. [151], Ditlevsen & Lansky [152], Höpfner [153], Lansky & Ditlevsen [154]), as a time-dynamic description of the nitrous oxide emission rate from the soil surface (see e.g., Pedersen [155]), as well as a model for the individual hazard rate in a survival analysis context (see e.g., Aalen & Gjessing [156]).

Along these lines of branching-type diffusion limits, it makes sense to consider the solutions of two SDEs (129) with different fixed parameter sets

(η, κ_{A}, σ)

and

(η, κ_{H}, σ)

, determine for each of them a corresponding approximating GW(I), investigate the Hellinger integral between the laws of these two GW(I), and finally calculate the limit of the Hellinger integral (bounds) as the GW(I) approach their SDE solutions. Notice that for technical reasons (which will be explained below), the constants

η

and

σ

ought to be independent of

A

,

H

in our current context.

In order to make the above-mentioned limit procedure rigorous, it is reasonable to work with appropriate approximations such that in each convergence step m one faces the setup

P_{NI} \cup P_{SP, 1}

(i.e., the non-immigration or the equal-fraction case), where the corresponding Hellinger integral can be calculated exactly in a recursive way, as stated in Theorem 1. Let us explain the details in the following.

Consider a sequence of GW(I)

{(X^{(m)})}_{m \in N}

with probability laws

P_{•}^{(m)}

on a measurable space

(Ω, F)

, where as above the subscript • stands for either the hypothesis

H

or the alternative

A

. Analogously to (1), we use for each fixed step

m \in N

the representation

X^{(m)} : = \{X_{ℓ}^{(m)}, ℓ \in N\}

with

X_{ℓ}^{(m)} := \sum_{j = 1}^{X_{ℓ - 1}^{(m)}} Y_{ℓ - 1, j}^{(m)} + {\tilde{Y}}_{ℓ}^{(m)}, \quad ℓ \in N, \quad X_{0}^{(m)} \in N \ \text{given},

(130)

where under the law

P_{•}^{(m)}

the collection $Y^{(m)} : = \{Y_{i, j}^{(m)}, i \in N_{0}, j \in N\}$ consists of i.i.d. random variables which are Poisson distributed with parameter $β_{•}^{(m)} > 0$ ,
the collection ${\tilde{Y}}^{(m)} : = \{{\tilde{Y}}_{i}^{(m)}, i \in N\}$ consists of i.i.d. random variables which are Poisson distributed with parameter $α_{•}^{(m)} \geq 0$ ,
$Y^{(m)}$ and ${\tilde{Y}}^{(m)}$ are independent.

From arbitrary drift-parameters

η \in [0, \infty [

,

κ_{•} \in [0, \infty [

, and diffusion-term-parameter

σ > 0

, we construct the offspring-distribution-parameter and the immigration-distribution parameter of the sequence

{(X_{ℓ}^{(m)})}_{ℓ \in N}

by

β_{•}^{(m)} := 1 - \frac{κ_{•}}{σ^{2} m} \quad \text{and} \quad α_{•}^{(m)} := β_{•}^{(m)} \cdot \frac{η}{σ^{2}} .

(131)

Here and henceforth, we always assume that the approximation step m is large enough to ensure that

β_{•}^{(m)} \in] 0, 1]

and at least one of

β_{A}^{(m)}

,

β_{H}^{(m)}

is strictly less than 1; this will be abbreviated by

m \in \bar{N}

. Let us point out that, as mentioned above, our choice entails the best-to-handle setup

P_{NI} \cup P_{SP, 1}

(which does not happen if instead of

η

one uses

η_{•}

with

η_{A} \neq η_{H}

). Based on the GW(I)

X^{(m)}

, let us construct the continuous-time branching process

{\tilde{X}}^{(m)} : = \{{\tilde{X}}_{s}^{(m)}, s \in [0, \infty [\}

by

{\tilde{X}}_{s}^{(m)} : = \frac{1}{m} X_{⌊σ^{2} m s⌋}^{(m)},

(132)

living on the state space

E^{(m)} : = \frac{1}{m} N_{0}

. Notice that

{\tilde{X}}^{(m)}

is constant on each time-interval

[\frac{k}{σ^{2} m}, \frac{k + 1}{σ^{2} m} [

and takes at

s = \frac{k}{σ^{2} m}

the value

\frac{1}{m} X_{k}^{(m)}

of the k-th GW(I) generation size, divided by m, i.e., it “jumps” with the jump-size

\frac{1}{m} (X_{k}^{(m)} - X_{k - 1}^{(m)})

which is equal to the

\frac{1}{m}

-fold difference to the previous generation size. From (132) one can immediately see the necessity of having

σ

to be independent of

A

,

H

because for the required law-equivalence in (the corresponding version of) (13) both models at stake have to “live” on the same time-scale

τ_{s}^{(m)} : = ⌊σ^{2} m s⌋

. For this setup, one obtains the following convergence result:

Theorem 10.

Let

η \in [0, \infty [

,

κ_{•} \in [0, \infty [

,

σ \in] 0, \infty [

and

{\tilde{X}}^{(m)}

be as defined in (130) to (132). Furthermore, let us suppose that

{lim}_{m \to \infty} \frac{1}{m} X_{0}^{(m)} = {\tilde{X}}_{0} > 0

and denote by

d ([0, \infty [, [0, \infty [)

the space of right-continuous functions

f : [0, \infty [\mapsto [0, \infty [

with left limits. Then the sequence of processes

{({\tilde{X}}^{(m)})}_{m \in \bar{N}}

converges in distribution in

d ([0, \infty [, [0, \infty [)

to a diffusion process

\tilde{X}

which is the unique strong, nonnegative–and in case of

\frac{η}{σ^{2}} \geq \frac{1}{2}

strictly positive–solution of the SDE

d {\tilde{X}}_{s} = (η - κ_{•} {\tilde{X}}_{s}) d s + σ \sqrt{{\tilde{X}}_{s}} d W_{s}^{•}, \quad s \in [0, \infty [, \quad {\tilde{X}}_{0} \in ] 0, \infty [ \ \text{given},

(133)

where

\{W_{s}^{•}, s \in [0, \infty [\}

denotes a standard Brownian motion with respect to the limit probability measure

{\tilde{P}}_{•}

.

Remark 8.

Notice that the condition

\frac{η}{σ^{2}} \geq \frac{1}{2}

can be interpreted in our approximation setup (131) as

α_{•}^{(m)} \geq β_{•}^{(m)} / 2

, which quantifies the intuitively reasonable indication that if the probability

P_{•} [{\tilde{Y}}_{ℓ}^{(m)} = 0] = e^{- α_{•}^{(m)}}

of having no immigration is small enough relative to the probability

P_{•} [Y_{ℓ - 1, k}^{(m)} = 0] = e^{- β_{•}^{(m)}}

of having no offspring (

m \in \bar{N}

), then the limiting diffusion

\tilde{X}

never hits zero almost surely.

The corresponding proof of Theorem 10, which is outlined in Appendix A.4, is an adaptation of the proof of Theorem 9.1.3 in Ethier & Kurtz [138], which deals with drift-parameters

η = 0

,

κ_{•} = 0

in the SDE (133) whose solution is approached on a

σ -

independent time scale by a sequence of (critical) Galton-Watson processes without immigration but with general offspring distribution with mean 1 and variance

σ^{2}

. Notice that due to (131) the latter is inconsistent with our Poissonian setup, but this is compensated by our chosen

σ -

dependent time scale. Other limit investigations for (133), involving offspring/immigration distributions and parametrizations which are also incompatible with ours, are treated e.g. in Sriram [142].

As an illustration of our proposed approach, let us give the following

Example 3.

Consider the parameter setup

(η, κ_{•}, σ) = (5, 2, 0.4)

and initial generation size

{\tilde{X}}_{0} = 3

. Figure 4 shows the diffusion-approximation

{\tilde{X}}_{s}^{(m)}

(blue) of the corresponding solution

{\tilde{X}}_{s}

of the SDE (133) up to the time horizon

T = 10

, for the approximation steps

m \in {13, 50, 200, 1000}

. Notice that in this setup there holds

\bar{N} = {k \in N : k \geq 13}

(recall that

\bar{N}

is the subset of the positive integers such that

β_{•}^{(m)} = 1 - \frac{κ_{•}}{σ^{2} \cdot m} > 0

). The “long-term mean” of the limit process

{\tilde{X}}_{s}

is

\frac{η}{κ_{•}} = 2.5

and is indicated as red line. The “long-term mean” of the approximations

{\tilde{X}}_{s}^{(m)}

is equal to

\frac{1}{m} \cdot \frac{α_{•}^{(m)}}{1 - β_{•}^{(m)}} = \frac{η}{κ_{•}} - \frac{η}{σ^{2} \cdot m} = 2.5 - 31.25 / m

and is displayed as green line.
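The quantities of Example 3 can be reproduced with a few lines of Python. The following sketch computes the parameters from (131), the smallest admissible approximation step, the long-term mean of the rescaled approximation (the stationary mean α/(1 - β) of the generation sizes, divided by m because of the rescaling in (132)), and one simulated rescaled path. The function names, the choice of the initial size as the rounded value of m times 3, the seed, and the simple Poisson sampler (adequate only for moderate means) are assumptions made for illustration; this is not the code that produced Figure 4.

```python
import math
import random


def gwi_parameters(eta, kappa, sigma, m):
    """Offspring and immigration parameters (beta^(m), alpha^(m)) from (131)."""
    beta = 1.0 - kappa / (sigma ** 2 * m)
    return beta, beta * eta / sigma ** 2


def smallest_admissible_step(kappa, sigma):
    """Smallest positive integer m with beta^(m) > 0."""
    m = 1
    while 1.0 - kappa / (sigma ** 2 * m) <= 0.0:
        m += 1
    return m


def rescaled_long_term_mean(eta, kappa, sigma, m):
    """Stationary mean alpha / (1 - beta) of the GWI, divided by m as in (132)."""
    beta, alpha = gwi_parameters(eta, kappa, sigma, m)
    return alpha / (1.0 - beta) / m


def poisson(mean, rng):
    """Multiplication-based Poisson sampler; suitable for moderate means only."""
    threshold, k, prod = math.exp(-mean), 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k


def simulate_rescaled_path(eta, kappa, sigma, m, x_tilde_0, horizon, seed=0):
    """Values of the rescaled process at its jump times k / (sigma^2 m) up to the horizon."""
    rng = random.Random(seed)
    beta, alpha = gwi_parameters(eta, kappa, sigma, m)
    x = round(m * x_tilde_0)                  # assumed initial generation size
    path = [x / m]
    for _ in range(math.floor(sigma ** 2 * m * horizon)):
        offspring = sum(poisson(beta, rng) for _ in range(x))
        x = offspring + poisson(alpha, rng)   # recursion (130)
        path.append(x / m)
    return path


print(smallest_admissible_step(2.0, 0.4))
print([rescaled_long_term_mean(5.0, 2.0, 0.4, m) for m in (13, 50, 200, 1000)])
print(simulate_rescaled_path(5.0, 2.0, 0.4, 13, 3.0, 10.0)[:5])
```

The second printed list approaches 2.5 as m grows, which matches the long-term mean of the limit process.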

7.2. Bounds of Hellinger Integrals for Diffusion Approximations

For each approximation step m and each observation horizon

t \in [0, \infty [

, let us now investigate the behaviour of the Hellinger integrals

H_{λ} (P_{A, t}^{(m), C d A} ∥ P_{H, t}^{(m), C d A})

, where

P_{•, t}^{(m), C d A}

denotes the canonical law (under

H

resp.

A

) of the continuous-time diffusion approximation

{\tilde{X}}^{(m)}

(cf. (132)), restricted to

[0, t]

. It is easy to see that

H_{λ} (P_{A, t}^{(m), C d A} ∥ P_{H, t}^{(m), C d A})

coincides with

H_{λ} (P_{A, ⌊σ^{2} m t⌋}^{(m)} ∥ P_{H, ⌊σ^{2} m t⌋}^{(m)})

of the law restrictions of the GW(I) generation sizes

{(X_{ℓ}^{(m)})}_{ℓ \in {0, \dots, ⌊σ^{2} m t⌋}}

, where

\frac{⌊σ^{2} m t⌋}{σ^{2} m}

can be interpreted as the last “jump-time” of

{\tilde{X}}^{(m)}

before t. These Hellinger integrals obey the results of

the Propositions 2 and 3 (for $η = 0$ ) respectively the Propositions 4 and 5 (for $η \in] 0, \infty [$ ), as far as recursively computable exact values are concerned,
Theorem 5 as far as closed-form bounds are concerned; recall that the current setup is of type $P_{NI} \cup P_{SP, 1}$ , and thus we can use the simplifications proposed in the Remark 7(a).

In order to obtain the desired Hellinger integral limits

{lim}_{m \to \infty} H_{λ} (P_{A, ⌊σ^{2} m t⌋}^{(m)} ∥ P_{H, ⌊σ^{2} m t⌋}^{(m)})

, one faces several technical problems which will be described in the following. To begin with, for fixed

m \in \bar{N}

we apply the Propositions 2(b), 3(b), 4(b), 5(b) to the current setup

(β_{A}^{(m)}, β_{H}^{(m)}, α_{A}^{(m)}, α_{H}^{(m)}) \in P_{NI} \cup P_{SP, 1}

with

β_{•}^{(m)} := β_{•} (m, κ_{•}, σ^{2}) := 1 - \frac{κ_{•}}{σ^{2} m} \quad \text{and} \quad α_{•}^{(m)} := α_{•} (m, κ_{•}, σ^{2}, η) := β_{•}^{(m)} \cdot \frac{η}{σ^{2}} \quad \text{(cf. (131))} .

Notice that

η = 0

corresponds to the no-immigration (NI) case and that

\frac{α_{•}^{(m)}}{β_{•}^{(m)}} = \frac{η}{σ^{2}}

. Accordingly, we set

α_{λ}^{(m)} : = λ \cdot α_{A}^{(m)} + (1 - λ) \cdot α_{H}^{(m)}, β_{λ}^{(m)} : = λ \cdot β_{A}^{(m)} + (1 - λ) \cdot β_{H}^{(m)}

. By using

q_{λ}^{(m)} : = q (m, κ_{•}, σ^{2}, λ) : = {(β_{A}^{(m)})}^{λ} {(β_{H}^{(m)})}^{1 - λ}, λ \in R \ {0, 1},

(134)

as well as the connected sequence

{(a_{n}^{(m)})}_{n \in N} : = {(a_{n}^{(q_{λ}^{(m)})})}_{n \in N}

we arrive at the

Corollary 13.

For all

(β_{A}^{(m)}, β_{H}^{(m)}, α_{A}^{(m)}, α_{H}^{(m)}, λ) \in (P_{NI} \cup P_{SP, 1}) \times (R \ {0, 1})

and all population sizes

X_{0}^{(m)} \in N

there holds

h_{λ}^{(m)} : = H_{λ} (P_{A, ⌊σ^{2} m t⌋}^{(m)} ∥ P_{H, ⌊σ^{2} m t⌋}^{(m)}) = exp \{a_{⌊σ^{2} m t⌋}^{(q_{λ}^{(m)})} \cdot X_{0}^{(m)} + \frac{η}{σ^{2}} \sum_{k = 1}^{⌊σ^{2} m t⌋} a_{k}^{(q_{λ}^{(m)})}\}

(135)

with

η = 0

in the NI case.
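For a fixed approximation step m, formula (135) can be evaluated directly by running the underlying recursion. The following Python sketch does this under two assumptions that are not spelled out in this section: the recursion (36) is taken to be a_n = q e^{a_{n-1}} - β_λ^{(m)} with a_0 = 0, and the initial population size is chosen as the rounded value of m times a given limit value. The function name and all numerical inputs are hypothetical.

```python
import math


def hellinger_integral_step_m(kappa_a, kappa_h, eta, sigma, lam, m, t, x_tilde_0):
    """Evaluate (135) for a single approximation step m."""
    sigma2 = sigma ** 2
    beta_a = 1.0 - kappa_a / (sigma2 * m)
    beta_h = 1.0 - kappa_h / (sigma2 * m)
    if not (0.0 < beta_a <= 1.0 and 0.0 < beta_h <= 1.0):
        raise ValueError("m is too small: beta^(m) must lie in ]0, 1]")
    q = beta_a ** lam * beta_h ** (1.0 - lam)          # (134)
    beta_lam = lam * beta_a + (1.0 - lam) * beta_h
    n = math.floor(sigma2 * m * t)                     # number of generations
    x0_pop = max(1, round(m * x_tilde_0))              # assumed initial size
    a, total = 0.0, 0.0
    for _ in range(n):
        a = q * math.exp(a) - beta_lam                 # assumed recursion, a_0 = 0
        total += a
    return math.exp(a * x0_pop + (eta / sigma2) * total)


# Hypothetical constellation: kappa_A = 2, kappa_H = 1, eta = 5, sigma = 0.4.
for m in (50, 100, 200, 400):
    print(m, hellinger_integral_step_m(2.0, 1.0, 5.0, 0.4, 0.5, m, 1.0, 3.0))
```

The cost of each evaluation grows linearly in m, since the recursion has to be run for ⌊σ² m t⌋ steps.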

In the following, we employ the SDE-parameter constellations (which are consistent with (131) in combination with our requirement to work here only on

(P_{NI} \cup P_{SP, 1})

)

\begin{matrix} {\tilde{P}}_{N I} & : = & \{(κ_{A}, κ_{H}, η), η = 0, κ_{A} \in [0, \infty [, κ_{H} \in [0, \infty [, κ_{A} \neq κ_{H}\}, \end{matrix}

(136)

\begin{matrix} {\tilde{P}}_{S P, 1} & : = & \{(κ_{A}, κ_{H}, η), η > 0, κ_{A} \in [0, \infty [, κ_{H} \in [0, \infty [, κ_{A} \neq κ_{H}\} . \end{matrix}

(137)

Due to the recursive nature (not representable in closed form) of the sequences

{(a_{n}^{(q)})}_{n \in N}

defined by (36), the calculation of

{lim}_{m \to \infty} h_{λ}^{(m)}

in (135) does not seem to be (straightforwardly) tractable; after all, one “has to move along” a sequence of recursions (roughly speaking) since

⌊σ^{2} m t⌋ \to \infty

as m tends to infinity. One way to “circumvent” such technical problems is to compute instead of the limit

{lim}_{m \to \infty} h_{λ}^{(m)}

of the (exact values of the) Hellinger integrals

h_{λ}^{(m)}

, the limits of the corresponding (explicit) closed-form lower resp. upper bounds adapted from Theorem 5. In order to achieve this, one first needs a preparatory step, due to the fact that the sequence

{(a_{⌊σ^{2} m t⌋}^{(q_{λ}^{(m)})})}_{m \in \bar{N}}

(and hence its bounds leading to closed-form expressions) does not necessarily converge for all

λ \in R \ [0, 1]

; roughly, this can be conjectured from the Propositions 3(c) and 5(c) in combination with

⌊σ^{2} m t⌋ \to \infty

. Correspondingly, for our “sequence-of-recursions” context equipped with the diffusion-limit’s drift-parameter constellations

(κ_{A}, κ_{H}, η)

we have to derive a “convergence interval”

[{\tilde{λ}}_{-}, {\tilde{λ}}_{+}] \ [0, 1]

which replaces the single-recursion-concerning

[λ_{-}, λ_{+}] \ [0, 1]

(cf. Lemma 1). This amounts to

Proposition 15.

For all

(κ_{A}, κ_{H}, η) \in {\tilde{P}}_{NI} \cup {\tilde{P}}_{SP, 1}

define

0 > {\tilde{λ}}_{-} := \begin{cases} - \infty, & \text{if } κ_{A} < κ_{H}, \\ - \frac{κ_{H}^{2}}{κ_{A}^{2} - κ_{H}^{2}}, & \text{if } κ_{A} > κ_{H}, \end{cases} \quad \text{and} \quad 1 < {\tilde{λ}}_{+} := \begin{cases} \frac{κ_{H}^{2}}{κ_{H}^{2} - κ_{A}^{2}}, & \text{if } κ_{A} < κ_{H}, \\ \infty, & \text{if } κ_{A} > κ_{H} . \end{cases}

(138)

Then, for all

(κ_{A}, κ_{H}, η, λ) \in ({\tilde{P}}_{NI} \cup {\tilde{P}}_{SP, 1}) \times] {\tilde{λ}}_{-}, {\tilde{λ}}_{+} [\ [0, 1]

there holds for all sufficiently large

m \in \bar{N}

q_{λ}^{(m)} : = {(1 - \frac{κ_{A}}{σ^{2} m})}^{λ} {(1 - \frac{κ_{H}}{σ^{2} m})}^{1 - λ} < min \{1, e^{β_{λ}^{(m)} - 1}\},

(139)

and thus the sequence

{(a_{n}^{(q_{λ}^{(m)})})}_{n \in N}

converges to the fixed point

x_{0}^{(m)} \in] 0, - log (q_{λ}^{(m)}) [

.

This will be proved in Appendix A.4.
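As a rough numerical illustration of Proposition 15 (not a substitute for the proof), the following Python sketch evaluates the convergence interval (138) and tests the inequality (139) for a given approximation step m. It assumes, in line with the notation of Lemma 1, that β_λ^{(m)} = λ β_A^{(m)} + (1 - λ) β_H^{(m)} with β_•^{(m)} = 1 - κ_•/(σ² m); the function names and the parameter values are hypothetical.

```python
import math


def convergence_interval(kappa_a, kappa_h):
    """Return (lambda_minus, lambda_plus) as defined in (138)."""
    if kappa_a == kappa_h:
        raise ValueError("(138) is stated only for kappa_a != kappa_h")
    if kappa_a < kappa_h:
        return -math.inf, kappa_h**2 / (kappa_h**2 - kappa_a**2)
    return -kappa_h**2 / (kappa_a**2 - kappa_h**2), math.inf


def q_and_beta(kappa_a, kappa_h, sigma2, lam, m):
    """Return q_lambda^(m) from (139) and the (assumed) beta_lambda^(m)."""
    beta_a = 1.0 - kappa_a / (sigma2 * m)  # assumed offspring mean under A
    beta_h = 1.0 - kappa_h / (sigma2 * m)  # assumed offspring mean under H
    if beta_a <= 0.0 or beta_h <= 0.0:
        raise ValueError("m is too small: the offspring means must be positive")
    q = beta_a**lam * beta_h**(1.0 - lam)
    beta = lam * beta_a + (1.0 - lam) * beta_h
    return q, beta


def condition_139(kappa_a, kappa_h, sigma2, lam, m):
    """True iff q_lambda^(m) < min{1, exp(beta_lambda^(m) - 1)}."""
    q, beta = q_and_beta(kappa_a, kappa_h, sigma2, lam, m)
    return q < min(1.0, math.exp(beta - 1.0))
```

Calling `condition_139` for a fixed λ and increasing m indicates from which approximation step onwards (139) holds; for λ outside the interval returned by `convergence_interval` the check is expected to fail for large m.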

We are now in the position to determine bounds of the Hellinger integral limits

{lim}_{m \to \infty} H_{λ} (P_{A, ⌊σ^{2} m t⌋}^{(m)} ∥ P_{H, ⌊σ^{2} m t⌋}^{(m)})

in the form of m-limits of appropriate versions of closed-form bounds from Section 6. For the sake of brevity, let us henceforth use the abbreviations

x_{0}^{(m)} : = x_{0}^{(q_{λ}^{(m)})}

,

Γ_{<}^{(m)} : = Γ_{<}^{(q_{λ}^{(m)})} = \frac{q_{λ}^{(m)}}{2} \cdot e^{x_{0}^{(m)}} \cdot {(x_{0}^{(m)})}^{2}

,

Γ_{>}^{(m)} : = Γ_{>}^{(q_{λ}^{(m)})} = \frac{q_{λ}^{(m)}}{2} \cdot {(x_{0}^{(m)})}^{2}

,

d^{(m), S} : = d^{(q_{λ}^{(m)}), S} = \frac{x_{0}^{(m)} - (q_{λ}^{(m)} - β_{λ}^{(m)})}{x_{0}^{(m)}}

and

d^{(m), T} : = d^{(q_{λ}^{(m)}), T} = q_{λ}^{(m)} \cdot e^{x_{0}^{(m)}} .

By the above considerations, Theorem 5 (together with Remark 7(a)) adapts to the current setup as follows:

Corollary 14.

(a) For all

(κ_{A}, κ_{H}, η, λ) \in ({\tilde{P}}_{N I} \cup {\tilde{P}}_{S P, 1}) \times] 0, 1 [

, all

t \in [0, \infty [

, all approximation steps

m \in \bar{N}

and all initial population sizes

X_{0}^{(m)} \in N

the Hellinger integral can be bounded by

\begin{matrix} C_{λ, X_{0}^{(m)}, t}^{(m), L} & : = & exp {x_{0}^{(m)} \cdot [X_{0}^{(m)} - \frac{η}{σ^{2}} \frac{d^{(m), T}}{1 - d^{(m), T}}] (1 - {(d^{(m), T})}^{⌊σ^{2} m t⌋}) + x_{0}^{(m)} \frac{η}{σ^{2}} \cdot ⌊σ^{2} m t⌋ \\ + {\underset{̲}{ζ}}_{⌊σ^{2} m t⌋}^{(m)} \cdot X_{0}^{(m)} + \frac{η}{σ^{2}} \cdot {\underset{̲}{ϑ}}_{⌊σ^{2} m t⌋}^{(m)}} \end{matrix}

(140)

\begin{matrix} \leq & H_{λ} (P_{A, ⌊σ^{2} m t⌋}^{(m)} ∥ P_{H, ⌊σ^{2} m t⌋}^{(m)}) \\ \leq & exp {x_{0}^{(m)} \cdot [X_{0}^{(m)} - \frac{η}{σ^{2}} \frac{d^{(m), S}}{1 - d^{(m), S}}] (1 - {(d^{(m), S})}^{⌊σ^{2} m t⌋}) + x_{0}^{(m)} \frac{η}{σ^{2}} \cdot ⌊σ^{2} m t⌋ \\ - {\bar{ζ}}_{⌊σ^{2} m t⌋}^{(m)} \cdot X_{0}^{(m)} - \frac{η}{σ^{2}} \cdot {\bar{ϑ}}_{⌊σ^{2} m t⌋}^{(m)}} = : C_{λ, X_{0}^{(m)}, t}^{(m), U}, \end{matrix}

(141)

where we define analogously to (98) to (101)

\begin{matrix} {\underset{̲}{ζ}}_{n}^{(m)} & : = & Γ_{<}^{(m)} \cdot \frac{{(d^{(m), T})}^{n - 1}}{1 - d^{(m), T}} \cdot (1 - {(d^{(m), T})}^{n}) > 0, \end{matrix}

(142)

\begin{matrix} {\underset{̲}{ϑ}}_{n}^{(m)} & : = & Γ_{<}^{(m)} \cdot \frac{1 - {(d^{(m), T})}^{n}}{{(1 - d^{(m), T})}^{2}} \cdot [1 - \frac{d^{(m), T} (1 + {(d^{(m), T})}^{n})}{1 + d^{(m), T}}] > 0, \end{matrix}

(143)

\begin{matrix} {\bar{ζ}}_{n}^{(m)} & : = & Γ_{<}^{(m)} \cdot [\frac{{(d^{(m), S})}^{n} - {(d^{(m), T})}^{n}}{d^{(m), S} - d^{(m), T}} - {(d^{(m), S})}^{n - 1} \cdot \frac{1 - {(d^{(m), T})}^{n}}{1 - d^{(m), T}}] > 0, \end{matrix}

(144)

\begin{matrix} {\bar{ϑ}}_{n}^{(m)} & : = & Γ_{<}^{(m)} \cdot \frac{d^{(m), T}}{1 - d^{(m), T}} \cdot [\frac{1 - {(d^{(m), S} d^{(m), T})}^{n}}{1 - d^{(m), S} d^{(m), T}} - \frac{{(d^{(m), S})}^{n} - {(d^{(m), T})}^{n}}{d^{(m), S} - d^{(m), T}}] > 0 . \end{matrix}

(145)

Notice that (140) and (141) simplify significantly for

(κ_{A}, κ_{H}, η, λ) \in {\tilde{P}}_{N I} \times] 0, 1 [

for which

η = 0

holds.

(b) For all

(κ_{A}, κ_{H}, η, λ) \in ({\tilde{P}}_{NI} \cup {\tilde{P}}_{SP, 1}) \times] {\tilde{λ}}_{-}, {\tilde{λ}}_{+} [\ [0, 1]

and all initial population sizes

X_{0}^{(m)} \in N

the Hellinger integral bounds (140) and (141) are valid for all sufficiently large

m \in \bar{N}

, where the expressions (142) to (145) have to be replaced by

\begin{matrix} {\underset{̲}{ζ}}_{n}^{(m)} & : = & Γ_{>}^{(m)} \cdot \frac{{(d^{(m), T})}^{n} - {(d^{(m), S})}^{2 n}}{d^{(m), T} - {(d^{(m), S})}^{2}} > 0, \end{matrix}

(146)

\begin{matrix} {\underset{̲}{ϑ}}_{n}^{(m)} & : = & \frac{Γ_{>}^{(m)}}{d^{(m), T} - {(d^{(m), S})}^{2}} \cdot [\frac{d^{(m), T} \cdot (1 - {(d^{(m), T})}^{n})}{1 - d^{(m), T}} - \frac{{(d^{(m), S})}^{2} \cdot (1 - {(d^{(m), S})}^{2 n})}{1 - {(d^{(m), S})}^{2}}] > 0, \\ {\bar{ζ}}_{n}^{(m)} & : = & Γ_{>}^{(m)} \cdot {(d^{(m), S})}^{n - 1} \cdot [n - \frac{1 - {(d^{(m), T})}^{n}}{1 - d^{(m), T}}] > 0, \end{matrix}

(147)

\begin{matrix} {\bar{ϑ}}_{n}^{(m)} & : = & Γ_{>}^{(m)} \cdot [\frac{d^{(m), S} - d^{(m), T}}{{(1 - d^{(m), S})}^{2} (1 - d^{(m), T})} \cdot (1 - {(d^{(m), S})}^{n}) \end{matrix}

(148)

\begin{matrix} + \frac{d^{(m), T} (1 - {(d^{(m), S} d^{(m), T})}^{n})}{(1 - d^{(m), T}) (1 - d^{(m), S} d^{(m), T})} - \frac{{(d^{(m), S})}^{n}}{1 - d^{(m), S}} \cdot n] . \end{matrix}

(149)

Let us finally present the desired assertions on the limits of the bounds given in Corollary 14 as the approximation step m tends to infinity, by employing for

λ \in] {\tilde{λ}}_{-}, {\tilde{λ}}_{+} [⊋ [0, 1]

the quantities

κ_{λ} : = λ κ_{A} + (1 - λ) κ_{H} as well as Λ_{λ} : = \sqrt{λ κ_{A}^{2} + (1 - λ) κ_{H}^{2}},

(150)

for which the following relations hold:

\begin{matrix} Λ_{λ} > κ_{λ} > 0, & for λ \in] 0, 1 [, \end{matrix}

(151)

\begin{matrix} 0 < Λ_{λ} < κ_{λ}, & for λ \in] {\tilde{λ}}_{-}, {\tilde{λ}}_{+} [\ [0, 1] . \end{matrix}

(152)
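The relations (151) and (152) can be checked numerically with the following minimal Python sketch of the quantities in (150). The function names are hypothetical. The comment in `gap_identity` records the elementary identity κ_λ² - Λ_λ² = -λ (1 - λ) (κ_A - κ_H)², which follows by expanding the square and explains the change of sign between ]0, 1[ and its complement.

```python
import math


def kappa_lambda(kappa_a, kappa_h, lam):
    """kappa_lambda from (150)."""
    return lam * kappa_a + (1.0 - lam) * kappa_h


def big_lambda(kappa_a, kappa_h, lam):
    """Lambda_lambda from (150); defined only if the radicand is positive."""
    radicand = lam * kappa_a**2 + (1.0 - lam) * kappa_h**2
    if radicand <= 0.0:
        raise ValueError("lambda lies outside the convergence interval of (138)")
    return math.sqrt(radicand)


def gap_identity(kappa_a, kappa_h, lam):
    """Return both sides of kappa^2 - Lambda^2 = -lam (1 - lam) (kA - kH)^2."""
    lhs = kappa_lambda(kappa_a, kappa_h, lam)**2 - big_lambda(kappa_a, kappa_h, lam)**2
    rhs = -lam * (1.0 - lam) * (kappa_a - kappa_h)**2
    return lhs, rhs
```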

Theorem 11.

Let the initial SDE-value

{\tilde{X}}_{0} \in] 0, \infty [

be arbitrary but fixed, and suppose that

{lim}_{m \to \infty} \frac{1}{m} X_{0}^{(m)} = {\tilde{X}}_{0}

. Then, for all

(κ_{A}, κ_{H}, η, λ) \in ({\tilde{P}}_{N I} \cup {\tilde{P}}_{S P, 1}) \times] {\tilde{λ}}_{-}, {\tilde{λ}}_{+} [\ {0, 1}

and all

t \in [0, \infty [

the Hellinger integral limit can be bounded by

\begin{matrix} D_{λ, {\tilde{X}}_{0}, t}^{L} & : = & exp {- \frac{Λ_{λ} - κ_{λ}}{σ^{2}} [{\tilde{X}}_{0} - \frac{η}{Λ_{λ}}] (1 - e^{- Λ_{λ} \cdot t}) - \frac{η}{σ^{2}} (Λ_{λ} - κ_{λ}) \cdot t \\ + L_{λ}^{(1)} (t) \cdot {\tilde{X}}_{0} + \frac{η}{σ^{2}} \cdot L_{λ}^{(2)} (t)} \end{matrix}

(153)

\begin{matrix} \leq & lim_{m \to \infty} H_{λ} (P_{A, ⌊σ^{2} m t⌋}^{(m)} ∥ P_{H, ⌊σ^{2} m t⌋}^{(m)}) \\ \leq & exp {- \frac{Λ_{λ} - κ_{λ}}{σ^{2}} [{\tilde{X}}_{0} - \frac{η}{\frac{1}{2} (Λ_{λ} + κ_{λ})}] (1 - e^{- \frac{1}{2} (Λ_{λ} + κ_{λ}) \cdot t}) - \frac{η}{σ^{2}} (Λ_{λ} - κ_{λ}) \cdot t \\ - U_{λ}^{(1)} (t) \cdot {\tilde{X}}_{0} - \frac{η}{σ^{2}} \cdot U_{λ}^{(2)} (t)} = : d_{λ, {\tilde{X}}_{0}, t}^{U}, \end{matrix}

(154)

where for the (sub)case of all

λ \in] 0, 1 [

and all

t \geq 0

\begin{matrix} L_{λ}^{(1)} (t) & : = & \frac{{(Λ_{λ} - κ_{λ})}^{2}}{2 σ^{2} \cdot Λ_{λ}} \cdot e^{- Λ_{λ} \cdot t} \cdot (1 - e^{- Λ_{λ} \cdot t}), \end{matrix}

(155)

\begin{matrix} L_{λ}^{(2)} (t) & : = & \frac{1}{4} \cdot {(\frac{Λ_{λ} - κ_{λ}}{Λ_{λ}})}^{2} \cdot {(1 - e^{- Λ_{λ} \cdot t})}^{2}, \end{matrix}

(156)

\begin{matrix} U_{λ}^{(1)} (t) & : = & \frac{{(Λ_{λ} - κ_{λ})}^{2}}{σ^{2}} \cdot [\frac{e^{- \frac{1}{2} (Λ_{λ} + κ_{λ}) \cdot t} - e^{- Λ_{λ} \cdot t}}{Λ_{λ} - κ_{λ}} - \frac{e^{- \frac{1}{2} (Λ_{λ} + κ_{λ}) \cdot t} (1 - e^{- Λ_{λ} \cdot t})}{2 \cdot Λ_{λ}}], \end{matrix}

(157)

\begin{matrix} U_{λ}^{(2)} (t) & : = & \frac{{(Λ_{λ} - κ_{λ})}^{2}}{Λ_{λ}} \cdot [\frac{1 - e^{- \frac{1}{2} (3 Λ_{λ} + κ_{λ}) \cdot t}}{3 Λ_{λ} + κ_{λ}} + \frac{e^{- Λ_{λ} \cdot t} - e^{- \frac{1}{2} (Λ_{λ} + κ_{λ}) \cdot t}}{Λ_{λ} - κ_{λ}}], \end{matrix}

(158)

and for the remaining (sub)case of all

λ \in] {\tilde{λ}}_{-}, {\tilde{λ}}_{+} [\ [0, 1]

and all

t \geq 0

\begin{matrix} L_{λ}^{(1)} (t) & : = & \frac{{(Λ_{λ} - κ_{λ})}^{2}}{2 σ^{2} \cdot κ_{λ}} \cdot e^{- Λ_{λ} \cdot t} \cdot (1 - e^{- κ_{λ} \cdot t}), \end{matrix}

(159)

\begin{matrix} L_{λ}^{(2)} (t) & : = & \frac{{(Λ_{λ} - κ_{λ})}^{2}}{2 \cdot κ_{λ}} \cdot [\frac{1 - e^{- Λ_{λ} \cdot t}}{Λ_{λ}} - \frac{1 - e^{- (Λ_{λ} + κ_{λ}) \cdot t}}{Λ_{λ} + κ_{λ}}], \end{matrix}

(160)

\begin{matrix} U_{λ}^{(1)} (t) & : = & \frac{{(Λ_{λ} - κ_{λ})}^{2}}{2 \cdot σ^{2}} \cdot e^{- \frac{1}{2} (Λ_{λ} + κ_{λ}) \cdot t} \cdot [t - \frac{1 - e^{- Λ_{λ} \cdot t}}{Λ_{λ}}], \end{matrix}

(161)

\begin{matrix} U_{λ}^{(2)} (t) & : = & {(Λ_{λ} - κ_{λ})}^{2} \cdot [\frac{(Λ_{λ} - κ_{λ}) (1 - e^{- \frac{1}{2} (Λ_{λ} + κ_{λ}) \cdot t})}{Λ_{λ} \cdot {(Λ_{λ} + κ_{λ})}^{2}} + \frac{1 - e^{- \frac{1}{2} (3 Λ_{λ} + κ_{λ}) \cdot t}}{Λ_{λ} \cdot (3 Λ_{λ} + κ_{λ})} - \frac{e^{- \frac{1}{2} (Λ_{λ} + κ_{λ}) \cdot t}}{Λ_{λ} + κ_{λ}} \cdot t] . \end{matrix}

(162)

Notice that the components

L_{λ}^{(i)} (t)

and

U_{λ}^{(i)} (t)

(for

i = 1, 2

and in both cases

λ \in] 0, 1 [

and

λ \in] {\tilde{λ}}_{-}, {\tilde{λ}}_{+} [\ [0, 1]

) are strictly positive for

t > 0

and do not depend on the parameter η. Furthermore, the bounds

D_{λ, {\tilde{X}}_{0}, t}^{L}

and

d_{λ, {\tilde{X}}_{0}, t}^{U}

simplify significantly in the case

(κ_{A}, κ_{H}, η) \in {\tilde{P}}_{N I}

, for which

η = 0

holds.

This will be proved in Appendix A.4. For the time-asymptotics, we obtain the

Corollary 15.

Let the initial SDE-value

{\tilde{X}}_{0} \in] 0, \infty [

be arbitrary but fixed, and suppose that

{lim}_{m \to \infty} \frac{1}{m} X_{0}^{(m)} = {\tilde{X}}_{0}

. Then:

(a) For all

(κ_{A}, κ_{H}, η, λ) \in {\tilde{P}}_{N I} \times] {\tilde{λ}}_{-}, {\tilde{λ}}_{+} [\ {0, 1}

the Hellinger integral limit converges to

lim_{t \to \infty} lim_{m \to \infty} log (H_{λ} (P_{A, ⌊σ^{2} m t⌋}^{(m)} ∥ P_{H, ⌊σ^{2} m t⌋}^{(m)})) = - \frac{{\tilde{X}}_{0}}{σ^{2}} \cdot (Λ_{λ} - κ_{λ}) \{\begin{matrix} < 0, & for λ \in] 0, 1 [, \\ > 0, & for λ \in] {\tilde{λ}}_{-}, {\tilde{λ}}_{+} [\ [0, 1] . \end{matrix}

(b) For all

(κ_{A}, κ_{H}, η, λ) \in {\tilde{P}}_{S P, 1} \times] {\tilde{λ}}_{-}, {\tilde{λ}}_{+} [\ {0, 1}

the Hellinger integral limit possesses the asymptotical behaviour

lim_{t \to \infty} \frac{1}{t} log (lim_{m \to \infty} H_{λ} (P_{A, ⌊σ^{2} m t⌋}^{(m)} ∥ P_{H, ⌊σ^{2} m t⌋}^{(m)})) = - \frac{η}{σ^{2}} \cdot (Λ_{λ} - κ_{λ}) \{\begin{matrix} < 0, & for λ \in] 0, 1 [, \\ > 0, & for λ \in] {\tilde{λ}}_{-}, {\tilde{λ}}_{+} [\ [0, 1] . \end{matrix}

The assertions of Corollary 15 follow immediately by inspecting the expressions in the exponential of (153) and (154) in combination with (155) to (162).
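For illustration, the exponents of the bounds (153) and (154), together with the correction terms (155) to (162), can be evaluated as in the following Python sketch. It is only a direct transcription of the displayed formulas under the assumptions κ_A ≠ κ_H and λ in the convergence interval with λ ∉ {0, 1}; the function and variable names are hypothetical.

```python
import math


def kappa_and_big_lambda(kA, kH, lam):
    """kappa_lambda and Lambda_lambda from (150)."""
    kap = lam * kA + (1.0 - lam) * kH
    Lam = math.sqrt(lam * kA**2 + (1.0 - lam) * kH**2)
    return kap, Lam


def log_bounds(kA, kH, eta, sigma2, lam, x0, t):
    """Return the exponents of the lower bound (153) and the upper bound (154)."""
    kap, Lam = kappa_and_big_lambda(kA, kH, lam)
    d = Lam - kap                      # nonzero since kA != kH and lam not in {0, 1}
    c = 0.5 * (Lam + kap)
    eL, ec = math.exp(-Lam * t), math.exp(-c * t)
    e3 = math.exp(-0.5 * (3.0 * Lam + kap) * t)
    if 0.0 < lam < 1.0:                # (155) to (158)
        L1 = d**2 / (2.0 * sigma2 * Lam) * eL * (1.0 - eL)
        L2 = 0.25 * (d / Lam)**2 * (1.0 - eL)**2
        U1 = d**2 / sigma2 * ((ec - eL) / d - ec * (1.0 - eL) / (2.0 * Lam))
        U2 = d**2 / Lam * ((1.0 - e3) / (3.0 * Lam + kap) + (eL - ec) / d)
    else:                              # (159) to (162)
        L1 = d**2 / (2.0 * sigma2 * kap) * eL * (1.0 - math.exp(-kap * t))
        L2 = d**2 / (2.0 * kap) * ((1.0 - eL) / Lam
                                   - (1.0 - math.exp(-(Lam + kap) * t)) / (Lam + kap))
        U1 = d**2 / (2.0 * sigma2) * ec * (t - (1.0 - eL) / Lam)
        U2 = d**2 * (d * (1.0 - ec) / (Lam * (Lam + kap)**2)
                     + (1.0 - e3) / (Lam * (3.0 * Lam + kap))
                     - ec * t / (Lam + kap))
    lower = (-d / sigma2 * (x0 - eta / Lam) * (1.0 - eL) - eta / sigma2 * d * t
             + L1 * x0 + eta / sigma2 * L2)
    upper = (-d / sigma2 * (x0 - eta / c) * (1.0 - ec) - eta / sigma2 * d * t
             - U1 * x0 - eta / sigma2 * U2)
    return lower, upper
```

Evaluating `log_bounds` for η = 0 and large t, respectively dividing its output by t for η > 0, gives a quick plausibility check of the time-asymptotics stated in Corollary 15.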

7.3. Bounds of Power Divergences for Diffusion Approximations

Analogously to Section 4 (see especially Section 4.1), for orders

λ \in R \ {0, 1}

all the results of the previous Section 7.2 carry correspondingly over from (limits of) bounds of the Hellinger integrals

H_{λ} (P_{A, ⌊σ^{2} m t⌋}^{(m)} ∥ P_{H, ⌊σ^{2} m t⌋}^{(m)})

to (limits of) bounds of the total variation distance

V (P_{A, ⌊σ^{2} m t⌋}^{(m)} ∥ P_{H, ⌊σ^{2} m t⌋}^{(m)})

(by virtue of (12)), to (limits of) bounds of the Renyi divergences

R_{λ} (P_{A, ⌊σ^{2} m t⌋}^{(m)} ∥ P_{H, ⌊σ^{2} m t⌋}^{(m)})

(by virtue of (7)) as well as to (limits of) bounds of the power divergences

I_{λ} (P_{A, ⌊σ^{2} m t⌋}^{(m)} ∥ P_{H, ⌊σ^{2} m t⌋}^{(m)})

(by virtue of (2)). For the sake of brevity, the (merely repetitive) exact details are omitted. Moreover, by combining the resulting assertions on the above-mentioned power divergences with parts of the Bayesian-decision-making context of Section 2.5, we obtain corresponding assertions on (i) the weighted-average decision risk reduction (weighted-average statistical information measure; cf. (21)) about the degree of evidence

deg

concerning the parameter

θ

that can be attained by observing the GWI-path

X_{n}

until stage n, as well as (ii) the limit decision risk reduction (limit statistical information measure; cf. (22)).

In the following, let us concentrate on the derivation of the Kullback-Leibler information divergence KL (relative entropy) within the current diffusion-limit framework. Notice that altogether we face two limit procedures simultaneously: by the first limit

{lim}_{λ ↑ 1} I_{λ} (P_{A, ⌊σ^{2} m t⌋}^{(m)} ∥ P_{H, ⌊σ^{2} m t⌋}^{(m)})

we obtain the KL

I (P_{A, ⌊σ^{2} m t⌋}^{(m)} ∥ P_{H, ⌊σ^{2} m t⌋}^{(m)})

for every fixed approximation step

m \in \bar{N}

; on the other hand, for each fixed

λ \in] 0, 1 [

, the second limit

{lim}_{m \to \infty} I_{λ} (P_{A, ⌊σ^{2} m t⌋}^{(m)} ∥ P_{H, ⌊σ^{2} m t⌋}^{(m)})

describes the limit of the power divergence – as the sequence of rescaled and continuously interpolated GW(I)’s

{({({\tilde{X}}_{s}^{(m)})}_{s \in [0, \infty [})}_{m \in \bar{N}}

(equipped with probability law

P_{A, ⌊σ^{2} m t⌋}^{(m)}

resp.

P_{H, ⌊σ^{2} m t⌋}^{(m)}

up to time

⌊σ^{2} m t⌋

) converges weakly to the continuous-time CIR-type diffusion process

{({\tilde{X}}_{s})}_{s \in [0, \infty [}

(with probability law

{\tilde{P}}_{A, t}

resp.

{\tilde{P}}_{H, t}

up to time t). In Appendix A.4 we shall prove that these two limits can be interchanged:

Theorem 12.

Let the initial SDE-value

{\tilde{X}}_{0} \in] 0, \infty [

be arbitrary but fixed, and suppose that

{lim}_{m \to \infty} \frac{1}{m} X_{0}^{(m)} = {\tilde{X}}_{0}

. Then, for all

(κ_{A}, κ_{H}, η) \in {\tilde{P}}_{N I} \cup {\tilde{P}}_{S P, 1}

and all

t \in [0, \infty [

, one gets the Kullback-Leibler information divergence (relative entropy) convergences

\begin{matrix} lim_{m \to \infty} I (P_{A, ⌊σ^{2} m t⌋}^{(m)} ∥ P_{H, ⌊σ^{2} m t⌋}^{(m)}) = lim_{m \to \infty} lim_{λ ↗ 1} I_{λ} (P_{A, ⌊σ^{2} m t⌋}^{(m)} ∥ P_{H, ⌊σ^{2} m t⌋}^{(m)}) \\ = \{\begin{matrix} \frac{{(κ_{A} - κ_{H})}^{2}}{2 σ^{2} \cdot κ_{A}} \cdot [({\tilde{X}}_{0} - \frac{η}{κ_{A}}) \cdot (1 - e^{- κ_{A} \cdot t}) + η \cdot t], & if κ_{A} > 0, \\ \frac{κ_{H}^{2}}{2 σ^{2}} \cdot [\frac{η}{2} \cdot t^{2} + {\tilde{X}}_{0} \cdot t], & if κ_{A} = 0, \end{matrix} \\ = lim_{λ ↗ 1} lim_{m \to \infty} I_{λ} (P_{A, ⌊σ^{2} m t⌋}^{(m)} ∥ P_{H, ⌊σ^{2} m t⌋}^{(m)}) . \end{matrix}

(163)

This immediately leads to the following

Corollary 16.

Let the initial SDE-value

{\tilde{X}}_{0} \in] 0, \infty [

be arbitrary but fixed, and suppose that

{lim}_{m \to \infty} \frac{1}{m} X_{0}^{(m)} = {\tilde{X}}_{0}

. Then, the KL limit (163) possesses the following time-asymptotical behaviour:

(a) For all

(κ_{A}, κ_{H}, η) \in {\tilde{P}}_{N I}

(i.e.,

η = 0

) one gets

\begin{matrix} (i) & in the case κ_{A} > 0 & lim_{t \to \infty} lim_{m \to \infty} I (P_{A, ⌊σ^{2} m t⌋}^{(m)} ∥ P_{H, ⌊σ^{2} m t⌋}^{(m)}) = \frac{{\tilde{X}}_{0} \cdot {(κ_{A} - κ_{H})}^{2}}{2 σ^{2} \cdot κ_{A}}, \\ (ii) & in the case κ_{A} = 0 & lim_{t \to \infty} lim_{m \to \infty} \frac{1}{t} \cdot I (P_{A, ⌊σ^{2} m t⌋}^{(m)} ∥ P_{H, ⌊σ^{2} m t⌋}^{(m)}) = \frac{{\tilde{X}}_{0} \cdot κ_{H}^{2}}{2 σ^{2}} . \end{matrix}

(b) For all

(κ_{A}, κ_{H}, η) \in {\tilde{P}}_{S P, 1}

(i.e.,

η > 0

) one gets

\begin{matrix} (i) & in the case κ_{A} > 0 & lim_{t \to \infty} lim_{m \to \infty} \frac{1}{t} \cdot I (P_{A, ⌊σ^{2} m t⌋}^{(m)} ∥ P_{H, ⌊σ^{2} m t⌋}^{(m)}) = \frac{η \cdot {(κ_{A} - κ_{H})}^{2}}{2 σ^{2} \cdot κ_{A}}, \\ (ii) & in the case κ_{A} = 0 & lim_{t \to \infty} lim_{m \to \infty} \frac{1}{t^{2}} \cdot I (P_{A, ⌊σ^{2} m t⌋}^{(m)} ∥ P_{H, ⌊σ^{2} m t⌋}^{(m)}) = \frac{η \cdot κ_{H}^{2}}{4 σ^{2}} . \end{matrix}
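The closed-form limit (163) is straightforward to evaluate numerically. The following Python sketch (function name and parameter values hypothetical) implements the right-hand side of (163) and can be used to inspect both the continuity of the two branches as κ_A tends to zero and the growth rates in t that underlie Corollary 16.

```python
import math


def kl_limit(kA, kH, eta, sigma2, x0, t):
    """Right-hand side of (163) for kA >= 0."""
    if kA < 0.0:
        raise ValueError("(163) is stated for kA >= 0 only")
    if kA > 0.0:
        pref = (kA - kH)**2 / (2.0 * sigma2 * kA)
        # -expm1(-kA * t) equals 1 - exp(-kA * t) and stays accurate for small kA * t
        return pref * ((x0 - eta / kA) * (-math.expm1(-kA * t)) + eta * t)
    return kH**2 / (2.0 * sigma2) * (0.5 * eta * t**2 + x0 * t)
```

Dividing `kl_limit(...)` by the appropriate power of t (t⁰, t or t², depending on whether η and κ_A vanish) and letting t grow reproduces the time-asymptotics that follow from (163); for η = 0 and κ_A = 0 the expression is exactly linear in t.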

Remark 9.

In Appendix A.4 we shall see that the proof of the last (limit-interchange concerning) equality in (163) relies heavily on the use of the extra terms

L_{λ}^{(1)} (t), L_{λ}^{(2)} (t), U_{λ}^{(1)} (t), U_{λ}^{(2)} (t)

in (153) and (154). Recall that these terms ultimately stem from (manipulations of) the corresponding parts of the “improved closed-form bounds” in Theorem 5, which were derived by using the linear inhomogeneous difference equations

{\underset{̲}{a}}_{n}^{(q)}

resp.

{\bar{a}}_{n}^{(q)}

(cf. (92) resp. (94)) instead of the linear homogeneous difference equations

a_{n}^{(q), T}

resp.

a_{n}^{(q), S}

(cf. (78) resp. (79)) as explicit approximations of the sequence

a_{n}^{(q)}

. This is not the only fact that shows the importance of this more tedious approach.

Interesting comparisons of the above-mentioned results in Section 7.2 and Section 7.3 with corresponding information measures of the solutions of the SDE (129) themselves (rather than their branching approximations) can be found in Kammerer [157].

7.4. Applications to Decision Making

Analogously to Section 6.7, the above-mentioned investigations of Sections 7.1, 7.2 and 7.3 can be applied to the context of Section 2.5 on dichotomous decision making about GW(I)-type diffusion approximations of solutions of the stochastic differential equation (129). For the sake of brevity, the (merely repetitive) exact details are omitted.

Author Contributions

Conceptualization, N.B.K. and W.S.; Formal analysis, N.B.K. and W.S.; Methodology, N.B.K. and W.S.; Visualization, N.B.K.; Writing, N.B.K. and W.S. All authors have read and agreed to the published version of the manuscript.

Funding

Niels B. Kammerer received a scholarship of the “Studienstiftung des Deutschen Volkes” for his PhD Thesis.

Acknowledgments

We are very grateful to the referees for their patience in reviewing this long manuscript, and for their helpful suggestions. Moreover, we would like to thank Andreas Greven for some useful remarks.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A. Proofs and Auxiliary Lemmas

Appendix A.1. Proofs and Auxiliary Lemmas for Section 3

Lemma A1.

For all real numbers

x, y, z > 0

and all

λ \in R

one has

x^{λ} y^{1 - λ} - (λ x z^{λ - 1} + (1 - λ) y z^{λ}) \begin{cases} \leq 0, & \text{for } λ \in ] 0, 1 [, \\ = 0, & \text{for } λ \in \{0, 1\}, \\ \geq 0, & \text{for } λ \in R \setminus [0, 1], \end{cases}

with equality in the cases

λ \in R \ {0, 1}

iff

\frac{x}{y} = z

.

Proof of Lemma A1.

For fixed

\tilde{x} : = x z^{λ - 1} > 0

,

\tilde{y} : = y z^{λ} > 0

with

\tilde{x} \neq \tilde{y}

we inspect the function g on

R

defined by

g (λ) : = {\tilde{x}}^{λ} {\tilde{y}}^{1 - λ} - (λ \tilde{x} + (1 - λ) \tilde{y})

which satisfies

g (0) = g (1) = 0

,

g^{'} (0) = \tilde{y} log (\tilde{x} / \tilde{y}) - (\tilde{x} - \tilde{y}) < \tilde{y} ((\tilde{x} / \tilde{y}) - 1) - (\tilde{x} - \tilde{y}) = 0

and which is strictly convex. Thus, the assertion follows immediately by taking into account the obvious case

\tilde{x} = \tilde{y}

. □
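The sign pattern asserted in Lemma A1 can be spot-checked numerically. The following Python sketch (hypothetical names, arbitrary sampling ranges and tolerance) evaluates the left-hand side of the inequality on random inputs; it is a sanity check only and proves nothing.

```python
import random


def lemma_a1_gap(x, y, z, lam):
    """Left-hand side of the inequality in Lemma A1."""
    return (x**lam * y**(1.0 - lam)
            - (lam * x * z**(lam - 1.0) + (1.0 - lam) * y * z**lam))


def check_lemma_a1(trials=1000, seed=0, tol=1e-9):
    """Return True if no sampled point violates the sign pattern of Lemma A1."""
    rng = random.Random(seed)
    for _ in range(trials):
        x, y, z = (rng.uniform(0.1, 5.0) for _ in range(3))
        lam = rng.uniform(-2.0, 3.0)
        gap = lemma_a1_gap(x, y, z, lam)
        if 0.0 < lam < 1.0 and gap > tol:
            return False          # should be <= 0 inside ]0, 1[
        if (lam < 0.0 or lam > 1.0) and gap < -tol:
            return False          # should be >= 0 outside [0, 1]
    return True
```

Choosing z equal to x / y makes `lemma_a1_gap` vanish up to rounding, in line with the equality statement of the lemma.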

Proof of Properties 1.

Property (P9) is trivially valid. To show (P1) we assume

0 < q < β_{λ}

, which implies

a_{1}^{(q)} = ξ_{λ}^{(q)} (0) = q - β_{λ} < 0

. By induction,

{(a_{n}^{(q)})}_{n \in N}

is strictly negative and strictly decreasing. As stated in (P9), the function

ξ_{λ}^{(q)}

is strictly increasing, strictly convex and converges to

- β_{λ}

for

x \to - \infty

. Thus, it hits the straight line

id (x) = x

once and only once on the negative real line at

x_{0}^{(q)} \in] - β_{λ}, 0 [

(cf. (44)). This implies that the sequence

{(a_{n}^{(q)})}_{n \in N}

converges to

x_{0}^{(q)} \in] - β_{λ}, q - β_{λ} [

. Property (P2) follows immediately. In order to prove (P3), let us fix

q > max {0, β_{λ}}

, implying

a_{1}^{(q)} = ξ_{λ}^{(q)} (0) = q - β_{λ} > 0

; notice that in this setup, the special choice

q = 1

implies

min {1, e^{β_{λ} - 1}} = e^{β_{λ} - 1} < q

. By induction,

{(a_{n}^{(q)})}_{n \in N}

is strictly positive and strictly increasing. Since

{lim}_{x \to \infty} ξ_{λ}^{(q)} (x) = \infty

, the function

ξ_{λ}^{(q)}

does not necessarily hit the straight line

id (x) = x

on the positive real line. In fact, due to strict convexity (cf. (P9)), this is excluded if

ξ_{λ}^{(q)'} (0) = q \geq 1

. Suppose that

q < 1

. To prove that there exists a positive solution of the equation

ξ_{λ}^{(q)} (x) = x

it is sufficient to show that the unique global minimum of the strictly convex function

h_{λ}^{(q)} (x) : = ξ_{λ}^{(q)} (x) - x

is taken at some point

x_{0} \in] 0, \infty [

and that

h_{λ}^{(q)} (x_{0}) \leq 0

. It holds

h_{λ}^{(q)'} (x) = q \cdot e^{x} - 1

, and therefore

h_{λ}^{(q)'} (x) = 0

iff

x = x_{0} = - log q

. We have

h_{λ}^{(q)} (- log q) = 1 - β_{λ} + log q

, which is less than or equal to zero iff

q \leq e^{β_{λ} - 1}

. It remains to show that for

q > β_{λ}

and

q > min \{1, e^{β_{λ} - 1}\}

the sequence

{(a_{n}^{(q)})}_{n \in N}

grows faster than exponentially, i.e., there do not exist constants

c_{1}, c_{2} \in R

such that

a_{n}^{(q)} \leq e^{c_{1} + c_{2} n}

for all

n \in N

. We already know that (in the current case)

a_{n}^{(q)} \overset{n \to \infty}{⟶} \infty

. Notice that it is sufficient to verify

{lim sup}_{n \to \infty} (log (a_{n + 1}^{(q)}) - log (a_{n}^{(q)})) = \infty

. For the case

β_{λ} \geq 0

the latter is obtained by

\begin{matrix} log (a_{n + 1}^{(q)}) - log (a_{n}^{(q)}) & = & log ((q - β_{λ}) e^{a_{n}^{(q)}} + β_{λ} (e^{a_{n}^{(q)}} - 1)) - log (q e^{a_{n - 1}^{(q)}} - β_{λ}) \\ \geq & (log (q - β_{λ}) - log (q)) + (q e^{a_{n - 1}^{(q)}} - β_{λ} - a_{n - 1}^{(q)}) \overset{a_{n - 1}^{(q)} \to \infty}{⟶} \infty . \end{matrix}

An analogous consideration works out for the case

β_{λ} < 0

. Property (P4) is trivial, and (P5) to (P8) are direct implications of the already proven properties (P1) to (P4). □
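The three regimes discussed in the above proof can be visualised by simply iterating the recursion a_1 = q - β_λ, a_{n+1} = ξ_λ^{(q)}(a_n) = q · e^{a_n} - β_λ. The following Python sketch does this; the function name, the number of steps and the cut-off `cap` (used as a heuristic divergence indicator) are assumptions made for illustration only.

```python
import math


def iterate_a(q, beta, n_steps=10_000, cap=50.0):
    """Iterate a_1 = q - beta, a_{n+1} = q * exp(a_n) - beta.

    Returns the last iterate, or None as soon as an iterate exceeds `cap`.
    Exceeding `cap` is read as divergence; this heuristic is adequate as long
    as -log(q) is well below `cap`, since any limit lies below -log(q).
    """
    a = q - beta
    for _ in range(n_steps):
        if a > cap:
            return None
        a = q * math.exp(a) - beta
    return a
```

For instance, with β_λ = 0.5 the choice q = 0.3 falls under (P1), q = 0.55 satisfies q ≤ min{1, e^{β_λ - 1}} as in (P3a), and q = 0.7 violates this condition, so that the iterates leave every bounded region.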

Proof of Lemma 1.

(a) Let

β_{A} > 0

,

β_{H} > 0

with

β_{A} \neq β_{H}

,

λ \in R \] 0, 1 [

,

β_{λ} : = λ β_{A} + (1 - λ) β_{H}

and

q_{λ} : = β_{A}^{λ} β_{H}^{1 - λ} > max {0, β_{λ}}

(cf. Lemma A1). Below, we follow the lines of Linkov & Lunyova [53], appropriately adapted to our context. We have to find those

λ \in R \] 0, 1 [

for which the following two conditions hold:

(i): $q_{λ} \leq 1$ , i.e., $ξ_{λ}^{(q_{λ})'} (0) \leq 1$ ,
(ii): $q_{λ} \leq e^{β_{λ} - 1}$ (cf. (P3a)), which is equivalent to the existence of a solution (positive, if (i) is satisfied) of the equation $ξ_{λ}^{(q_{λ})} (x) = x$ .

Notice that the case

q_{λ} = 1

,

λ \in R \ [0, 1]

, cannot appear in (i), provided that (ii) holds (since due to Lemma A1

e^{β_{λ} - 1} < e^{q_{λ} - 1} = 1

). For (i), it is easy to check that we have to require

λ \{\begin{matrix} < & \frac{log (β_{H})}{log (β_{H} / β_{A})}, if β_{A} > β_{H}, \\ > & \frac{log (β_{H})}{log (β_{H} / β_{A})}, if β_{A} < β_{H} . \end{matrix}

(A1)

To proceed, straightforward analysis leads to

- log (q_{λ}) = arg {min}_{x \in R} \{ξ_{λ}^{(q_{λ})} (x) - x\}

. To check (ii), we first notice that

q_{λ} \leq e^{β_{λ} - 1}

iff

ξ_{λ}^{(q_{λ})} (x) - x \leq 0

for some

x \in R

. Hence, we calculate

\begin{matrix} ξ_{λ}^{(q_{λ})} (- log (q_{λ})) + log (q_{λ}) \leq 0 ⟺ 1 - λ (β_{A} - β_{H}) - β_{H} + λ log (\frac{β_{A}}{β_{H}}) + log (β_{H}) \leq 0 \\ ⟺ λ \cdot [β_{H} (1 - \frac{β_{A}}{β_{H}}) + log (\frac{β_{A}}{β_{H}})] \leq β_{H} - 1 - log (β_{H}) . \end{matrix}

(A2)

In order to isolate

λ

in (A2), one has to find out for which

(β_{A}, β_{H})

the term in the square bracket is positive resp. zero resp. negative. To achieve this, we aim for the substitutions

x : = β_{A} / β_{H}

,

β = β_{H}

and thus study first the auxiliary function

h_{β} (x) : = log (x) - β (x - 1)

,

x > 0

, with fixed parameters

β > 0

. Straightforwardly, we obtain

h_{β}^{'} (x) = x^{- 1} - β

and

h_{β}^{''} (x) = - x^{- 2}

. Thus, the function

h_{β} (\cdot)

is strictly concave and attains a maximum at

x = β^{- 1}

. Since additionally

h_{β} (1) = 0

and

h_{β}^{'} (1) = 1 - β

, there exists a second solution

z (β) \neq 1

of the equation

h_{β} (x) = 0

iff

β \neq 1

. Thus, one gets

for $β = 1$ : for all $x > 0$ there holds $h_{β} (x) \leq 0$ , with equality iff $x = β^{- 1}$ ,
for $β < 1$ : $h_{β} (x) \geq 0$ iff $x \in [1, z (β)]$ , with equality iff $x \in {1, z (β)}$ (notice that $z (β) > 1$ ),
for $β > 1$ : $h_{β} (x) \geq 0$ iff $x \in [z (β), 1]$ , with equality iff $x \in {z (β), 1}$ (notice that $z (β) < 1$ ).

Suppose that

λ < 0

.

Case 1: If

β_{H} = 1

, then condition (ii) is not satisfied whenever

β_{A} \neq β_{H}

, since the right side of (A2) is equal to zero and the left side is strictly greater than zero. Hence,

λ_{-} = 0

.

Case 2: Let

β_{H} > 1

. If

β_{A} < β_{H}

, then condition (i) is not satisfied and hence

λ_{-} = 0

. If

β_{A} > β_{H}

, then condition (i) is satisfied iff

λ < \overset{˘}{\overset{˘}{λ}} : = \overset{˘}{\overset{˘}{λ}} (β_{A}, β_{H}) : = \frac{log (β_{H})}{log (β_{H} / β_{A})} < 0

. On the other hand, incorporating the discussion of the function

h_{β} (\cdot)

, we see that

h_{β_{H}} (\frac{β_{A}}{β_{H}}) < 0

. Thus, (A2) implies that condition (ii) is satisfied when

λ \geq \overset{˘}{λ} : = \overset{˘}{λ} (β_{A}, β_{H}) : = \frac{β_{H} - 1 - log (β_{H})}{β_{H} - β_{A} + log (\frac{β_{A}}{β_{H}})}

. We claim that

\overset{˘}{\overset{˘}{λ}} < \overset{˘}{λ}

and conclude that the conditions (i) and (ii) are not fulfilled jointly, which leads to

λ_{-} = 0

. To see this, we notice that due to

1 < β_{H} < β_{A}

we get

log (β_{A}) / (β_{A} - 1) < log (β_{H}) / (β_{H} - 1)

and thus

\begin{matrix} log (β_{A}) (β_{H} - 1) < log (β_{H}) (β_{A} - 1) \\ ⟺ & β_{H} log (β_{H}) - β_{A} log (β_{H}) < β_{H} log (β_{H}) - β_{H} log (β_{A}) - log (β_{H}) + log (β_{A}) \\ ⟺ & log (β_{H}) (β_{H} - β_{A}) + log (β_{H}) log (\frac{β_{A}}{β_{H}}) < log (\frac{β_{H}}{β_{A}}) (β_{H} - 1) + log (β_{H}) log (\frac{β_{A}}{β_{H}}) \\ ⟺ & \frac{log (β_{H})}{log (\frac{β_{H}}{β_{A}})} < \frac{β_{H} - 1 - log (β_{H})}{β_{H} - β_{A} + log (\frac{β_{A}}{β_{H}})} ⟺ \overset{˘}{\overset{˘}{λ}} < \overset{˘}{λ} . \end{matrix}

(A3)

Case 3: Let

β_{H} < 1

. For this, one gets

h_{β_{H}} (\frac{β_{A}}{β_{H}}) \geq 0

for

β_{A} \in] β_{H}, β_{H} z (β_{H})]

. Hence, condition (ii) is satisfied if either

β_{A} \in] β_{H}, β_{H} z (β_{H})]

, or

β_{A} \notin] β_{H}, β_{H} z (β_{H})]

and

λ \geq \overset{˘}{λ}

. If

β_{A} > β_{H} z (β_{H})

, then condition (i) is trivially satisfied for all

λ < 0

. In the case

β_{A} < β_{H}

, condition (i) is satisfied whenever

λ > \overset{˘}{\overset{˘}{λ}}

. Notice that since

0 < β_{A} < β_{H} < 1

, an analogous consideration as in (A3) leads to

\overset{˘}{\overset{˘}{λ}} < \overset{˘}{λ}

. This implies that

λ_{-} = \overset{˘}{λ}

. The last case

β_{A} \in] β_{H}, β_{H} z (β_{H})]

is easy to handle: since

\frac{log (β_{H})}{log (β_{H} / β_{A})} > 0

as well as

h_{β_{H}} (\frac{β_{A}}{β_{H}}) \geq 0

, both conditions (i) and (ii) hold trivially.

The representation of

λ_{+}

follows straightforwardly from the

λ_{-}

-result and the skew symmetry (8), by employing

1 - \overset{˘}{λ} (β_{H}, β_{A}) = \overset{˘}{λ} (β_{A}, β_{H})

. Alternatively, one can proceed analogously to the

λ_{-}

-case.

Part (b) is much easier to prove: if

β_{•} : = β_{A} = β_{H} > 0

, then for all

λ \in R \ [0, 1]

one gets

q_{λ} = β_{A}^{λ} β_{H}^{1 - λ} = β_{•}

as well as

β_{λ} = β_{•}

. Hence, Properties 1 (P2) implies that

a_{n}^{(q_{λ})} \equiv 0

and thus it is convergent, independently of the choice

λ \in R \ [0, 1]

. □
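The two thresholds appearing in the proof of Lemma 1(a), namely the one obtained from condition (i) via (A1) and the one obtained from condition (ii) via (A2), can be computed as in the following Python sketch. The function names are hypothetical; the sketch merely allows one to check the ordering claimed in (A3) and the identity used for the skew-symmetry argument on concrete numbers.

```python
import math


def lam_breve(beta_a, beta_h):
    """Threshold from condition (ii): lambda isolated in (A2)."""
    return ((beta_h - 1.0 - math.log(beta_h))
            / (beta_h - beta_a + math.log(beta_a / beta_h)))


def lam_double_breve(beta_a, beta_h):
    """Threshold from condition (i), cf. (A1)."""
    return math.log(beta_h) / math.log(beta_h / beta_a)
```

For example, for 1 < β_H < β_A one can compare `lam_double_breve(beta_a, beta_h)` with `lam_breve(beta_a, beta_h)`, and `1 - lam_breve(beta_h, beta_a)` with `lam_breve(beta_a, beta_h)`.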

Proof of Formula (51).

For the parameter constellation in Section 3.10, we employ as upper bound for

ϕ_{λ} (x)

(

x \in N_{0}

) the function

\bar{ϕ_{λ}} (x) : = \{\begin{matrix} ϕ_{λ} (0), & if x = 0, \\ 0, & if x > 0 . \end{matrix}

Notice that this method is rather crude; in the other cases treated in Sections 3.7, 3.8 and 3.9 it gives worse bounds than those derived there. Since

λ \in] 0, 1 [

and

α_{A} \neq α_{H}

, one has

ϕ_{λ} (0) < 0

. In order to derive an upper bound of the Hellinger integral, we first set

\bar{ϵ} : = 1 - e^{ϕ_{λ} (0)} \in] 0, 1 [

. Hence, for all

n \in N \ {1}

we obtain the auxiliary expression

\begin{matrix} \sum_{x_{n - 1} = 0}^{\infty} \frac{{[φ_{λ} (x_{n - 2})]}^{x_{n - 1}}}{x_{n - 1}!} \cdot exp \{ϕ_{λ} (x_{n - 1})\} \leq \sum_{x_{n - 1} = 0}^{\infty} \frac{{[φ_{λ} (x_{n - 2})]}^{x_{n - 1}}}{x_{n - 1}!} \cdot exp \{\bar{ϕ_{λ}} (x_{n - 1})\} \\ = exp \{φ_{λ} (x_{n - 2})\} - \bar{ϵ} = exp \{φ_{λ} (x_{n - 2})\} \cdot [1 - \bar{ϵ} \cdot exp \{- φ_{λ} (x_{n - 2})\}] . \end{matrix}

Moreover, since

β_{A} \neq β_{H}

, one gets

{lim}_{x \to \infty} ϕ_{λ} (x) = - \infty

(cf. Properties 3 (P20) and Lemma A1). This–together with the nonnegativity of

φ_{λ} (\cdot)

–implies

sup_{x \in N_{0}} \{exp \{ϕ_{λ} (x)\} \cdot [1 - \bar{ϵ} \cdot exp \{- φ_{λ} (x)\}]\} = : \bar{δ} \in] 0, 1 [.

Incorporating these considerations as well as the formulas (27) to (32), we get for

n = 1

the relation

H_{λ} (P_{A, n} ∥ P_{H, n}) = exp {ϕ_{λ} (x_{0})} \leq 1

(with equality iff

x_{0} = x^{*} = \frac{α_{A} - α_{H}}{β_{H} - β_{A}}

), and–as a continuation of formula (29)– for all

n \in N \ {1}

(recall that

\vec{x} : = (x_{0}, x_{1}, \dots) \in Ω

)

\begin{matrix} H_{λ} (P_{A, n} ∥ P_{H, n}) = \sum_{x_{1} = 0}^{\infty} \dots \sum_{x_{n} = 0}^{\infty} \prod_{k = 1}^{n} Z_{n, k}^{(λ)} (\vec{x}) \\ = \sum_{x_{1} = 0}^{\infty} \dots \sum_{x_{n - 1} = 0}^{\infty} \prod_{k = 1}^{n - 1} Z_{n, k}^{(λ)} (\vec{x}) \\ \cdot exp \{{(f_{A} (x_{n - 1}))}^{λ} {(f_{H} (x_{n - 1}))}^{(1 - λ)} - (λ f_{A} (x_{n - 1}) + (1 - λ) f_{H} (x_{n - 1}))\} \\ = \sum_{x_{1} = 0}^{\infty} \dots \sum_{x_{n - 2} = 0}^{\infty} \prod_{k = 1}^{n - 2} Z_{n, k}^{(λ)} (\vec{x}) \cdot exp \{- f_{λ} (x_{n - 2})\} \sum_{x_{n - 1} = 0}^{\infty} \frac{{[φ_{λ} (x_{n - 2})]}^{x_{n - 1}}}{x_{n - 1}!} \cdot exp {ϕ_{λ} (x_{n - 1})} \\ \leq \sum_{x_{1} = 0}^{\infty} \dots \sum_{x_{n - 2} = 0}^{\infty} \prod_{k = 1}^{n - 2} Z_{n, k}^{(λ)} (\vec{x}) \cdot exp \{ϕ_{λ} (x_{n - 2})\} \cdot [1 - \bar{ϵ} \cdot exp \{- φ_{λ} (x_{n - 2})\}] \\ \leq \bar{δ} \cdot \sum_{x_{1} = 0}^{\infty} \dots \sum_{x_{n - 2} = 0}^{\infty} \prod_{k = 1}^{n - 2} Z_{n, k}^{(λ)} (\vec{x}) \leq \dots \leq {\bar{δ}}^{⌊n / 2⌋} . \end{matrix}

(A4)

Hence,

H_{λ} (P_{A, n} ∥ P_{H, n}) < 1

for (at least) all

n \in N \ {1}

, and

{lim}_{n \to \infty} H_{λ} (P_{A, n} ∥ P_{H, n}) = 0

. □

Notice that the above proof method of formula (51) does not work for the parameter setup in Section 3.11, because there one gets

\bar{δ} = {sup}_{x \in N_{0}} \{exp \{ϕ_{λ} (x)\} \cdot [1 - \bar{ϵ} \cdot exp \{- φ_{λ} (x)\}]\} = 1

.

Proof of Proposition 9.

In the setup

(β_{A}, β_{H}, α_{A}, α_{H}, λ) \in P_{SP, 4 a} \times] 0, 1 [

we require

β_{•} : = β_{A} = β_{H} < 1

. As a linear upper bound for

ϕ_{λ} (\cdot)

, we employ the tangent line at

y \geq 0

(cf. (52))

ϕ_{λ, y}^{\tan} (x) : = (p_{y} - α_{λ}) + (q_{y} - β_{•}) \cdot x : = (p_{λ, y}^{\tan} - α_{λ}) + (q_{λ, y}^{\tan} - β_{λ}) \cdot x : = (ϕ_{λ} (y) - y \cdot ϕ_{λ}^{'} (y)) + ϕ_{λ}^{'} (y) \cdot x .

(A5)

Since in the current setup

P_{SP, 4 a}

the function

ϕ_{λ} (\cdot)

is strictly increasing, the slope

ϕ_{λ}^{'} (y)

of the tangent line at y is positive. Thus we have

q_{y} > β_{λ}

and Properties 1 (P3) implies that the sequence

{(a_{n}^{(q_{y})})}_{n \in N}

is strictly increasing and converges to

x_{0}^{(q_{y})} \in] 0, - log (q_{y})]

iff

q_{y} \leq min {1, e^{β_{•} - 1}} = e^{β_{•} - 1} < 1

(cf. (P3a)), where

x_{0}^{(q_{y})}

is the smallest solution of the equation

ξ_{λ}^{(q_{y})} (x) = q_{y} \cdot e^{x} - β_{•} = x

. Since

q_{y} ↘ β_{•}

for

y \to \infty

(cf. Properties 3 (P18)) and additionally

e^{β_{•} - 1} > β_{•}

, there exists a large enough

y \geq 0

such that the sequence

{(a_{n}^{(q_{y})})}_{n \in N}

converges. If this y is also large enough to additionally guarantee

h (y) < 0

for

h (y) : = lim_{n \to \infty} \frac{1}{n} log ({\tilde{B}}_{λ, X_{0}, n}^{(p_{y}, q_{y})}) = p_{y} \cdot e^{x_{0}^{(q_{y})}} - α_{λ},

then one can conclude that

{lim}_{n \to \infty} H_{λ} (P_{A, n} ∥ P_{H, n}) = 0

. As a first step, for verifying

h (y) < 0

we look for an upper bound

{\bar{x}}_{0}^{(q_{y})}

for the fixed point

x_{0}^{(q_{y})}

where the latter exists for

y \geq y_{1}

(say). Notice that

{\bar{Q}}_{λ}^{(q_{y})} (x) : = \frac{1}{2} x^{2} + q_{y} x + q_{y} - β_{•} \geq q_{y} \cdot e^{x} - β_{•} = ξ_{λ}^{(q_{y})} (x),

(A6)

since

{\bar{Q}}_{λ}^{(q_{y})} (0) = ξ_{λ}^{(q_{y})} (0)

,

{\bar{Q}}_{λ}^{(q_{y})'} (0) = ξ_{λ}^{(q_{y})'} (0)

and

{\bar{Q}}_{λ}^{(q_{y})''} (x) \geq ξ_{λ}^{(q_{y})''} (x)

for

x \in [0, - log (q_{y})]

. For sufficiently large

y \geq y_{2} \geq y_{1}

(say), we easily obtain the smaller solution of

{\bar{Q}}_{λ}^{(q_{y})} (x) = x

as

{\bar{x}}_{0}^{(q_{y})} = (1 - q_{y}) - \sqrt{{(1 - q_{y})}^{2} - 2 (q_{y} - β_{•})} = (1 - ϕ_{λ}^{'} (y) - β_{•}) - \sqrt{{(1 - ϕ_{λ}^{'} (y) - β_{•})}^{2} - 2 ϕ_{λ}^{'} (y)} \geq x_{0}^{(q_{y})}

(A7)

where the expression in the root is positive since

q_{y} ↘ β_{•}

for

y \to \infty

. We now have

h (y) = p_{y} \cdot e^{x_{0}^{(q_{y})}} - α_{λ} \leq p_{y} \cdot e^{{\bar{x}}_{0}^{(q_{y})}} - α_{λ} = : \bar{h} (y), \forall y \geq y_{2} .

(A8)
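The bounding argument in (A6) to (A8) can be illustrated numerically. The following is a minimal sketch, not part of the original proof: it iterates the recursion a_n = q e^{a_{n-1}} - β_• from the starting value a_0 = 0 (so that a_1 = q - β_•) and compares the limit with the closed-form bound from (A7). The function names and the values β_• = 0.5, q = 0.55 are hypothetical and chosen only so that β_• < q ≤ e^{β_• - 1}.

```python
import math

def fixed_point_sequence(q, beta, n_steps=500):
    """Iterate a_n = q * exp(a_{n-1}) - beta, starting from a_0 = 0."""
    a, seq = 0.0, []
    for _ in range(n_steps):
        a = q * math.exp(a) - beta
        seq.append(a)
    return seq

def upper_bound_fixed_point(q, beta):
    """Smaller root of x^2/2 + q*x + q - beta = x, i.e. the bound in (A7)."""
    disc = (1.0 - q) ** 2 - 2.0 * (q - beta)
    if disc < 0:
        raise ValueError("q is too far above beta: no real root")
    return (1.0 - q) - math.sqrt(disc)

# Hypothetical parameters with beta < q <= exp(beta - 1).
beta, q = 0.5, 0.55
seq = fixed_point_sequence(q, beta)
x0 = seq[-1]                                # numerical limit of the sequence
x0_bar = upper_bound_fixed_point(q, beta)   # bound from (A7)
print(x0, x0_bar)
```

In this example the sequence increases to a fixed point inside ]0, -log(q)], and the quadratic bound lies slightly above it, as (A7) asserts.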

Hence, it suffices to show that

\bar{h} (y) < 0

for some

y \geq y_{2}

. We recall from Properties 3 (P15), (P17) and (P19) that

\begin{matrix} ϕ_{λ} (y) & = & {(α_{A} + β_{•} \cdot y)}^{λ} {(α_{H} + β_{•} \cdot y)}^{1 - λ} - λ (α_{A} + β_{•} \cdot y) - (1 - λ) (α_{H} + β_{•} \cdot y) < 0, \\ ϕ_{λ}^{'} (y) & = & λ \cdot β_{•} \cdot {(\frac{α_{A} + β_{•} \cdot y}{α_{H} + β_{•} \cdot y})}^{λ - 1} + (1 - λ) \cdot β_{•} \cdot {(\frac{α_{A} + β_{•} \cdot y}{α_{H} + β_{•} \cdot y})}^{λ} - β_{•} > 0 and that \\ ϕ_{λ}^{''} (y) & = & - {(\frac{α_{A} + β_{•} \cdot y}{α_{H} + β_{•} \cdot y})}^{λ} \cdot \frac{λ (1 - λ) \cdot β_{•}^{2} \cdot {(α_{A} - α_{H})}^{2}}{{(α_{A} + β_{•} \cdot y)}^{2} (α_{H} + β_{•} \cdot y)} < 0, \end{matrix}

(A9)

which immediately implies

{lim}_{y \to \infty} ϕ_{λ} (y) = {lim}_{y \to \infty} ϕ_{λ}^{'} (y) = {lim}_{y \to \infty} ϕ_{λ}^{''} (y) = 0

and with l’Hospital’s rule

\begin{matrix} lim_{y \to \infty} y \cdot ϕ_{λ} (y) & = & lim_{y \to \infty} - y^{2} \cdot ϕ_{λ}^{'} (y) = lim_{y \to \infty} \frac{y^{3}}{2} \cdot ϕ_{λ}^{''} (y) \\ = & - \frac{1}{2} lim_{y \to \infty} {(\frac{α_{A} + β_{•} \cdot y}{α_{H} + β_{•} \cdot y})}^{λ} \cdot \frac{λ (1 - λ) \cdot β_{•}^{2} \cdot {(α_{A} - α_{H})}^{2}}{{(α_{A} / y + β_{•})}^{2} (α_{H} / y + β_{•})} = - \frac{1}{2} λ (1 - λ) \cdot \frac{{(α_{A} - α_{H})}^{2}}{β_{•}} . \end{matrix}

(A10)
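As a plausibility check of (A10), one can evaluate the three scaled quantities at a large argument. The sketch below uses the explicit formulas from (A9); the parameter values (λ = 0.3, α_A = 2, α_H = 1, β_• = 0.5) and the function names are hypothetical.

```python
def phi(y, lam, a_A, a_H, beta):
    """phi_lambda(y) as in (A9), for beta_A = beta_H = beta."""
    f_A, f_H = a_A + beta * y, a_H + beta * y
    return f_A**lam * f_H**(1 - lam) - lam * f_A - (1 - lam) * f_H

def dphi(y, lam, a_A, a_H, beta):
    """First derivative of phi_lambda as in (A9)."""
    r = (a_A + beta * y) / (a_H + beta * y)
    return lam * beta * r**(lam - 1) + (1 - lam) * beta * r**lam - beta

def d2phi(y, lam, a_A, a_H, beta):
    """Second derivative of phi_lambda as in (A9)."""
    f_A, f_H = a_A + beta * y, a_H + beta * y
    r = f_A / f_H
    return -(r**lam) * lam * (1 - lam) * beta**2 * (a_A - a_H)**2 / (f_A**2 * f_H)

# Hypothetical parameters (lambda in ]0,1[, beta < 1).
lam, a_A, a_H, beta = 0.3, 2.0, 1.0, 0.5
limit = -0.5 * lam * (1 - lam) * (a_A - a_H)**2 / beta   # right-hand side of (A10)

y = 1.0e4
scaled = (
    y * phi(y, lam, a_A, a_H, beta),
    -y**2 * dphi(y, lam, a_A, a_H, beta),
    0.5 * y**3 * d2phi(y, lam, a_A, a_H, beta),
)
print(limit, scaled)
```

All three scaled values agree with the limit up to terms of order 1/y, which is consistent with the repeated application of l’Hospital’s rule.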

The formulas (A5), (A7) and (A9) imply the limits

{lim}_{y \to \infty} p_{y} = α_{λ}

,

{lim}_{y \to \infty} q_{y} = β_{•}

,

{lim}_{y \to \infty} {\bar{x}}_{0}^{(q_{y})} = 0

. Notice that

p_{y} < α_{λ}

holds trivially for all

y \geq 0

since the intercept

(p_{y} - α_{λ})

of the tangent line

ϕ_{λ, y}^{\tan} (\cdot)

is negative. Incorporating (A8) we therefore obtain

{lim}_{y \to \infty} h (y) \leq {lim}_{y \to \infty} \bar{h} (y) = 0

. As mentioned before, for the proof it is sufficient to show that

\bar{h} (y) < 0

for some

y \geq y_{2}

. This holds true if

{lim}_{y \to \infty} y \cdot \bar{h} (y) < 0

. To verify this, notice first that from (A5), (A7) and (A8) we get

{\bar{h}}^{'} (y) = - p_{y} \cdot e^{{\bar{x}}_{0}^{(q_{y})}} \cdot ϕ_{λ}^{''} (y) \cdot [1 - \frac{2 - ϕ_{λ}^{'} (y) - β_{•}}{\sqrt{{(1 - q_{y})}^{2} - 2 (q_{y} - β_{•})}}] - y \cdot ϕ_{λ}^{''} (y) \cdot e^{{\bar{x}}_{0}^{(q_{y})}} \overset{y \to \infty}{⟶} 0 .

(A11)

Finally we obtain with (A10)

\begin{matrix} lim_{y \to \infty} y \cdot \bar{h} (y) & = & - lim_{y \to \infty} y^{2} \cdot {\bar{h}}^{'} (y) \\ = & lim_{y \to \infty} p_{y} \cdot e^{{\bar{x}}_{0}^{(q_{y})}} \cdot y^{2} \cdot ϕ_{λ}^{''} (y) \cdot [1 - \frac{2 - ϕ_{λ}^{'} (y) - β_{•}}{\sqrt{{(1 - q_{y})}^{2} - 2 (q_{y} - β_{•})}}] + y^{3} \cdot ϕ_{λ}^{''} (y) \cdot e^{{\bar{x}}_{0}^{(q_{y})}} \\ = & 0 - λ (1 - λ) \cdot \frac{{(α_{A} - α_{H})}^{2}}{β_{•}} < 0 . □ \end{matrix}
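The final limit can also be checked numerically. The sketch below evaluates \bar{h}(y) from (A8) with the tangent-line parameters p_y = ϕ_λ(y) - y ϕ_λ'(y) + α_λ and q_y = ϕ_λ'(y) + β_•. It assumes α_λ = λ α_A + (1 - λ) α_H, by analogy with β_λ. The parameter values and function names are hypothetical.

```python
import math

def tangent_parameters(y, lam, a_A, a_H, beta):
    """Return (p_y, q_y, alpha_lambda) for the tangent line of phi_lambda at y."""
    f_A, f_H = a_A + beta * y, a_H + beta * y
    r = f_A / f_H
    phi = f_A**lam * f_H**(1 - lam) - lam * f_A - (1 - lam) * f_H
    dphi = lam * beta * r**(lam - 1) + (1 - lam) * beta * r**lam - beta
    alpha_lam = lam * a_A + (1 - lam) * a_H   # assumed definition of alpha_lambda
    return phi - y * dphi + alpha_lam, dphi + beta, alpha_lam

def h_bar(y, lam, a_A, a_H, beta):
    """Upper bound h-bar(y) = p_y * exp(x-bar_0) - alpha_lambda from (A8)."""
    p_y, q_y, alpha_lam = tangent_parameters(y, lam, a_A, a_H, beta)
    x_bar = (1 - q_y) - math.sqrt((1 - q_y)**2 - 2 * (q_y - beta))   # (A7)
    return p_y * math.exp(x_bar) - alpha_lam

# Hypothetical parameters (lambda in ]0,1[, beta < 1).
lam, a_A, a_H, beta = 0.3, 2.0, 1.0, 0.5
limit = -lam * (1 - lam) * (a_A - a_H)**2 / beta

y = 1.0e4
value = h_bar(y, lam, a_A, a_H, beta)
print(value, y * value, limit)
```

For these values \bar{h}(y) is already negative at y = 10^4, and y · \bar{h}(y) is close to the stated limit.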

Proof of Corollary 1.

Part (a) follows directly from Proposition 1 (a),(b) and the limit

{lim}_{n \to \infty} H_{λ} (P_{A, n} ∥ P_{H, n}) = 0

in the respective part (c) of Propositions 7, 8 and 9, as well as from (51). To prove part (b), according to (26) we have to verify

{lim inf}_{λ ↗ 1} \{{lim inf}_{n \to \infty} H_{λ} (P_{A, n} ∥ P_{H, n})\} = 1

. From part (c) of Proposition 2 we see that this is satisfied iff

{lim}_{λ ↑ 1} x_{0}^{(q_{λ}^{E})} = 0

. Recall that for fixed

λ \in] 0, 1 [

we have

β_{λ} = λ β_{A} + (1 - λ) β_{H} > 0

,

q_{λ}^{E} = β_{A}^{λ} β_{H}^{1 - λ} < β_{λ}

(cf. Lemma A1) and from Properties 1 (P1) the unique negative solution

x_{0}^{(q_{λ}^{E})} \in] - β_{λ}, q_{λ}^{E} - β_{λ} [

of

ξ_{λ}^{(q_{λ}^{E})} (x) = q_{λ}^{E} e^{x} - β_{λ} = x

(cf. (44)). Due to the continuity and boundedness of the map

λ \mapsto x_{0}^{(q_{λ}^{E})}

(for

λ \in [0, 1]

) one gets that

{lim}_{λ ↗ 1} x_{0}^{(q_{λ}^{E})}

exists and is the smallest nonpositive solution of

β_{A} e^{x} - β_{A} = x

. From this, part (b) as well as the non-contiguity in part (c) follow immediately. The other part of (c) is a direct consequence of Proposition 1 (a),(b) and Proposition 2 (c). □

Proof of Formula (59).

One can proceed similarly to the proof of formula (51) above. Recall

H_{λ} (P_{A, 1} ∥ P_{H, 1}) = exp {ϕ_{λ} (X_{0})} > 1

for

X_{0} \in N

(cf. (28), Lemma A1 and

f_{A} (X_{0}) \neq f_{H} (X_{0})

for all

X_{0} \in N

). For

(β_{A}, β_{H}, α_{A}, α_{H}, λ) \in P_{SP, 2} \times (R \ [0, 1])

one gets

ϕ_{λ} (0) = 0, ϕ_{λ} (1) > 0

, and we define for

x \geq 0

\underset{̲}{ϕ_{λ}} (x) : = \{\begin{matrix} ϕ_{λ} (1), & if x = 1, \\ 0, & if x \neq 1 . \end{matrix}

By means of the choice

\underset{̲}{ϵ} : = φ_{λ} (1) \cdot (e^{ϕ_{λ} (1)} - 1) > 0

, we obtain for all

n \in N \ {1}

\begin{matrix} \sum_{x_{n - 1} = 0}^{\infty} \frac{{[φ_{λ} (x_{n - 2})]}^{x_{n - 1}}}{x_{n - 1}!} \cdot exp \{ϕ_{λ} (x_{n - 1})\} \geq \sum_{x_{n - 1} = 0}^{\infty} \frac{{[φ_{λ} (x_{n - 2})]}^{x_{n - 1}}}{x_{n - 1}!} \cdot exp \{\underset{̲}{ϕ_{λ}} (x_{n - 1})\} \\ = exp \{φ_{λ} (x_{n - 2})\} + \underset{̲}{ϵ} = exp \{φ_{λ} (x_{n - 2})\} \cdot [1 + \underset{̲}{ϵ} \cdot exp \{- φ_{λ} (x_{n - 2})\}] . \end{matrix}

Incorporating

inf_{x \in N_{0}} \{exp \{ϕ_{λ} (x)\} \cdot [1 + \underset{̲}{ϵ} \cdot exp \{- φ_{λ} (x)\}]\} = : \underset{̲}{δ} > 1,

one can show analogously to (A4) that

H_{λ} (P_{A, n} ∥ P_{H, n}) \geq \dots \geq {\underset{̲}{δ}}^{⌊n / 2⌋} \overset{n \to \infty}{⟶} \infty . □

Proof of the Formulas (61), (63) and (64).

In the following, we slightly adapt the above-mentioned proof of formula (59). Let us define

\underset{̲}{ϕ_{λ}} (x) : = \{\begin{matrix} ϕ_{λ} (0), & if x = 0, \\ 0, & if x > 0 . \end{matrix}

In all respective subcases one clearly has

\underset{̲}{ϕ_{λ}} (0) = ϕ_{λ} (0) > 0

. With

\underset{̲}{ϵ} : = e^{ϕ_{λ} (0)} - 1 > 0

we obtain for all

n \in N \ {1}

\begin{matrix} \sum_{x_{n - 1} = 0}^{\infty} \frac{{[φ_{λ} (x_{n - 2})]}^{x_{n - 1}}}{x_{n - 1}!} \cdot exp \{ϕ_{λ} (x_{n - 1})\} \geq \sum_{x_{n - 1} = 0}^{\infty} \frac{{[φ_{λ} (x_{n - 2})]}^{x_{n - 1}}}{x_{n - 1}!} \cdot exp \{\underset{̲}{ϕ_{λ}} (x_{n - 1})\} \\ = exp \{φ_{λ} (x_{n - 2})\} + \underset{̲}{ϵ} = exp \{φ_{λ} (x_{n - 2})\} \cdot [1 + \underset{̲}{ϵ} \cdot exp \{- φ_{λ} (x_{n - 2})\}] . \end{matrix}

By employing

inf_{x \in N_{0}} \{exp \{ϕ_{λ} (x)\} \cdot [1 + \underset{̲}{ϵ} \cdot exp \{- φ_{λ} (x)\}]\} = : \underset{̲}{δ} > 1,

(A12)

one can show analogously to (A4) that

H_{λ} (P_{A, n} ∥ P_{H, n}) \geq \dots \geq {\underset{̲}{δ}}^{⌊n / 2⌋} \overset{n \to \infty}{⟶} \infty .

Notice that this method does not work for the parameter cases

P_{SP, 4 a} \cup P_{SP, 4 b}

, since there the infimum in (A12) is equal to one. □
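The key step in the two preceding proofs is that replacing ϕ_λ by a function which is nonzero at a single point changes the Poisson-type sum by an explicit additive constant. The sketch below checks this for the variant used for (61), (63) and (64), where only the term x_{n-1} = 0 is modified. The numerical values of ϕ_λ(0) and φ_λ(x_{n-2}) are hypothetical placeholders, as are the function names.

```python
import math

def poisson_weighted_sum(varphi, g, n_terms=200):
    """Truncated sum over x >= 0 of varphi**x / x! * exp(g(x))."""
    total, term = 0.0, 1.0           # term holds varphi**x / x!
    for x in range(n_terms):
        total += term * math.exp(g(x))
        term *= varphi / (x + 1)
    return total

phi0 = 0.4     # hypothetical value of phi_lambda(0) > 0
varphi = 1.7   # hypothetical value of varphi_lambda(x_{n-2}) > 0

def phi_lower(x):
    """Lower bound function: phi_lambda(0) at x = 0 and 0 elsewhere."""
    return phi0 if x == 0 else 0.0

eps = math.exp(phi0) - 1.0
lhs = poisson_weighted_sum(varphi, phi_lower)
rhs = math.exp(varphi) + eps
factored = math.exp(varphi) * (1.0 + eps * math.exp(-varphi))
print(lhs, rhs, factored)
```

The three printed numbers coincide up to rounding, which mirrors the two equalities in the display preceding (A12).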

Proof of Proposition 13.

In the setup

(β_{A}, β_{H}, α_{A}, α_{H}, λ) \in P_{SP, 4 a} \times (R \ [0, 1])

we require

β_{•} : = β_{A} = β_{H} < 1

. As in the proof of Proposition 9, we stick to the tangent line

ϕ_{λ, y}^{\tan} (\cdot)

at

y \geq 0

(cf. (52)) as a linear lower bound for

ϕ_{λ} (\cdot)

, i.e., we use the function

ϕ_{λ, y}^{\tan} (x) : = (p_{y} - α_{λ}) + (q_{y} - β_{•}) \cdot x : = (p_{λ, y}^{\tan} - α_{λ}) + (q_{λ, y}^{\tan} - β_{λ}) \cdot x : = (ϕ_{λ} (y) - y \cdot ϕ_{λ}^{'} (y)) + ϕ_{λ}^{'} (y) \cdot x .

(A13)

As already mentioned in Section 3.21, on

P_{SP, 4 a}

the function

ϕ_{λ} (\cdot)

is strictly decreasing and converges to 0. Thus, for all

y \geq 0

the slope

ϕ_{λ}^{'} (y)

of the tangent line at y is negative, which implies that

q_{y} < β_{λ} = β_{•}

. For

λ \in R \ [0, 1]

there clearly may hold

q_{y} < 0

for some

y \in R

. However, there exists a sufficiently large

y_{1} > 0

such that

q_{y} > 0

for all

y > y_{1}

, since

{lim}_{y \to \infty} ϕ_{λ}^{'} (y) = 0

and hence

q_{y} ↗ β_{•} > 0

for

y \to \infty

. Thus, let us suppose that

y > y_{1}

. Then, the sequence

{(a_{n}^{(q_{y})})}_{n \in N}

is strictly negative, strictly decreasing and converges to

x_{0}^{(q_{y})} \in] - β_{•}, q_{y} - β_{•} [

(cf. Properties 1 (P1)). If there is some

y \geq y_{1}

such that

h (y) > 0

with

h (y) : = lim_{n \to \infty} \frac{1}{n} log ({\tilde{B}}_{λ, X_{0}, n}^{(p_{y}, q_{y})}) = p_{y} \cdot e^{x_{0}^{(q_{y})}} - α_{λ},

then one can conclude that

{lim}_{n \to \infty} H_{λ} (P_{A, n} ∥ P_{H, n}) = \infty

. Let us at first consider the case

α_{λ} \geq 0

. By employing

p_{y} ↘ α_{λ}

for

y \to \infty

, one gets

p_{y} > 0

for all

y \geq 0

. Analogously to the proof of Proposition 9, we now look for a lower bound

{\underset{̲}{x}}_{0}^{(q_{y})}

of the fixed point

x_{0}^{(q_{y})}

. Notice that

x_{0}^{(q_{y})} > - β_{•}

implies

{\underset{̲}{Q}}_{λ}^{(q_{y})} (x) : = \frac{e^{- β_{•}}}{2} \cdot q_{y} \cdot x^{2} + q_{y} \cdot x + q_{y} - β_{•} \leq q_{y} \cdot e^{x} - β_{•} = ξ_{λ}^{(q_{y})} (x),

(A14)

since

{\underset{̲}{Q}}_{λ}^{(q_{y})} (0) = ξ_{λ}^{(q_{y})} (0) < 0

,

{\underset{̲}{Q}}_{λ}^{(q_{y})'} (0) = ξ_{λ}^{(q_{y})'} (0) > 0

and

0 < {\underset{̲}{Q}}_{λ}^{(q_{y})''} (x) < ξ_{λ}^{(q_{y})''} (x)

for

x \in] - β_{•}, 0]

. Thus, the negative solution

{\underset{̲}{x}}_{0}^{(q_{y})}

of the equation

{\underset{̲}{Q}}_{λ}^{(q_{y})} (x) = x

(which definitely exists) implies that there holds

{\underset{̲}{x}}_{0}^{(q_{y})} \leq x_{0}^{(q_{y})}

. We easily obtain

\begin{matrix} {\underset{̲}{x}}_{0}^{(q_{y})} & = & \frac{e^{β_{•}}}{q_{y}} [(1 - q_{y}) - \sqrt{{(1 - q_{y})}^{2} - 2 e^{- β_{•}} q_{y} (q_{y} - β_{•})}] \\ = & \frac{e^{β_{•}}}{ϕ_{λ}^{'} (y) + β_{•}} [(1 - ϕ_{λ}^{'} (y) - β_{•}) - \sqrt{{(1 - ϕ_{λ}^{'} (y) - β_{•})}^{2} - 2 \cdot e^{- β_{•}} q_{y} \cdot ϕ_{λ}^{'} (y)}] < 0 . \end{matrix}

(A15)

Since

h (y) = p_{y} \cdot e^{x_{0}^{(q_{y})}} - α_{λ} \geq p_{y} \cdot e^{{\underset{̲}{x}}_{0}^{(q_{y})}} - α_{λ} = : \underset{̲}{h} (y),

(A16)

it is sufficient to show

\underset{̲}{h} (y) > 0

for some

y > y_{1}

. We recall from Properties 3 (P15), (P17) and (P19) that

\begin{matrix} ϕ_{λ} (y) & = & {(α_{A} + β_{•} \cdot y)}^{λ} {(α_{H} + β_{•} \cdot y)}^{1 - λ} - λ (α_{A} + β_{•} \cdot y) - (1 - λ) (α_{H} + β_{•} \cdot y) > 0, \\ ϕ_{λ}^{'} (y) & = & λ \cdot β_{•} \cdot {(\frac{α_{A} + β_{•} \cdot y}{α_{H} + β_{•} \cdot y})}^{λ - 1} + (1 - λ) \cdot β_{•} \cdot {(\frac{α_{A} + β_{•} \cdot y}{α_{H} + β_{•} \cdot y})}^{λ} - β_{•} < 0 and \\ ϕ_{λ}^{''} (y) & = & - {(\frac{α_{A} + β_{•} \cdot y}{α_{H} + β_{•} \cdot y})}^{λ} \cdot \frac{λ (1 - λ) \cdot β_{•}^{2} \cdot {(α_{A} - α_{H})}^{2}}{{(α_{A} + β_{•} \cdot y)}^{2} (α_{H} + β_{•} \cdot y)} > 0, \end{matrix}

(A17)

which immediately implies

{lim}_{y \to \infty} ϕ_{λ} (y) = {lim}_{y \to \infty} ϕ_{λ}^{'} (y) = {lim}_{y \to \infty} ϕ_{λ}^{''} (y) = 0

, and by means of l’Hospital’s rule

\begin{matrix} lim_{y \to \infty} y \cdot ϕ_{λ} (y) = lim_{y \to \infty} - y^{2} \cdot ϕ_{λ}^{'} (y) = lim_{y \to \infty} \frac{y^{3}}{2} \cdot ϕ_{λ}^{''} (y) \\ = - \frac{1}{2} lim_{y \to \infty} {(\frac{α_{A} + β_{•} \cdot y}{α_{H} + β_{•} \cdot y})}^{λ} \cdot \frac{λ (1 - λ) \cdot β_{•}^{2} \cdot {(α_{A} - α_{H})}^{2}}{{(α_{A} / y + β_{•})}^{2} (α_{H} / y + β_{•})} = - \frac{1}{2} λ (1 - λ) \cdot \frac{{(α_{A} - α_{H})}^{2}}{β_{•}} . \end{matrix}

(A18)

The Formulas (A13), (A15), (A17) imply the limits

{lim}_{y \to \infty} p_{y} = α_{λ}

,

{lim}_{y \to \infty} q_{y} = β_{•}

and

{lim}_{y \to \infty} {\underset{̲}{x}}_{0}^{(q_{y})} = 0

iff

β_{•} \leq 1

. The latter is due to the fact that for

β_{•} > 1

one gets with (A15)

{lim}_{y \to \infty} {\underset{̲}{x}}_{0}^{(q_{y})} = \frac{e^{β_{•}}}{β_{•}} [(1 - β_{•}) - \sqrt{{(1 - β_{•})}^{2}}] = \frac{e^{β_{•}}}{β_{•}} [2 - 2 β_{•}] \neq 0

. In the following, let us assume

β_{•} < 1

(the reason why we exclude the case

β_{•} = 1

is explained below). One gets

{lim}_{y \to \infty} h (y) \geq {lim}_{y \to \infty} \underset{̲}{h} (y) = 0

. Since we have to prove that

\underset{̲}{h} (y) > 0

for some

y > y_{1}

, it is sufficient to show that

{lim}_{y \to \infty} y \cdot \underset{̲}{h} (y) > 0

. To verify the latter, we first derive with l’Hospital’s rule and with (A17), (A18)

\begin{matrix} lim_{y \to \infty} y \cdot (1 - e^{{\underset{̲}{x}}_{0}^{(q_{y})}}) = lim_{y \to \infty} y^{2} \cdot e^{{\underset{̲}{x}}_{0}^{(q_{y})}} \cdot (\frac{\partial}{\partial y} {\underset{̲}{x}}_{0}^{(q_{y})}) \\ = lim_{y \to \infty} {y^{2} \cdot \frac{- e^{β_{•}} \cdot ϕ_{λ}^{''} (y)}{{(ϕ_{λ}^{'} (y) + β_{•})}^{2}} \cdot [(1 - q_{y}) - \sqrt{{(1 - q_{y})}^{2} - 2 e^{- β_{•}} q_{y} (q_{y} - β_{•})}] \\ + \frac{e^{β_{•}}}{q_{y}} \cdot [- y^{2} \cdot ϕ_{λ}^{''} (y) - \frac{- 2 y^{2} ϕ_{λ}^{''} (y) (1 - q_{y}) - 2 y^{2} ϕ_{λ}^{''} (y) e^{- β_{•}} q_{y} - 2 y^{2} ϕ_{λ}^{''} (y) e^{- β_{•}} ϕ_{λ}^{'} (y)}{2 \cdot \sqrt{{(1 - q_{y})}^{2} - 2 e^{- β_{•}} q_{y} (q_{y} - β_{•})}}]} \\ = 0 . \end{matrix}

(A19)

Notice that without further examination this limit would not necessarily hold for

β_{•} = 1

, since then the denominator in (A19) converges to zero. With (A13), (A16), (A18) and (A19) we finally obtain

\begin{matrix} lim_{y \to \infty} y \cdot \underset{̲}{h} (y) & = & lim_{y \to \infty} \{(y \cdot ϕ_{λ} (y) - y^{2} \cdot ϕ_{λ}^{'} (y)) \cdot e^{{\underset{̲}{x}}_{0}^{(q_{y})}} - y \cdot (1 - e^{{\underset{̲}{x}}_{0}^{(q_{y})}}) α_{λ}\} \\ = & - λ (1 - λ) \frac{{(α_{A} - α_{H})}^{2}}{β_{•}} > 0 . \end{matrix}

(A20)

Let us now consider the case

α_{λ} < 0

. The proof works almost completely analogously to the case

α_{λ} \geq 0

. We indicate the main differences. Since

p_{y} ↘ α_{λ} < 0

and

q_{y} ↗ β_{•} \in] 0, 1 [

for

y \to \infty

, there is a sufficiently large

y_{2} > y_{1}

, such that

p_{y} < 0

and

q_{y} > 0

. Thus,

{\bar{Q}}_{λ}^{(q_{y})} (x) : = \frac{q_{y}}{2} \cdot x^{2} + q_{y} \cdot x + q_{y} - β_{•} \geq ξ_{λ}^{(q_{y})} (x) = q_{y} e^{x} - β_{•} for x \in] - \infty, 0] .

The corresponding (existing) smaller solution of

{\bar{Q}}_{λ}^{(q_{y})} (x) = x

is

{\bar{x}}_{0}^{(q_{y})} = \frac{1}{q_{y}} [(1 - q_{y}) - \sqrt{{(1 - q_{y})}^{2} - 2 q_{y} (q_{y} - β_{•})}],

having the same form as the solution (A15) with

e^{- β_{•}}

substituted by 1. Notice that there clearly holds

x_{0}^{(q_{y})} < {\bar{x}}_{0}^{(q_{y})} < 0

. However, since

p_{y} < 0

, we now get

h (y) = p_{y} \cdot e^{x_{0}^{(q_{y})}} - α_{λ} \geq p_{y} \cdot e^{{\bar{x}}_{0}^{(q_{y})}} - α_{λ} = : \underset{̲}{h} (y)

, as in (A16). Since all calculations (A17) to (A20) remain valid (with

e^{- β_{•}}

substituted by 1), this proof is finished. □
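The two quadratic bounds used in this proof can be compared with the actual fixed point numerically. The following sketch iterates a_n = q e^{a_{n-1}} - β_• from a_0 = 0, then evaluates the lower bound (A15) and the upper bound obtained by substituting e^{-β_•} with 1. The values β_• = 0.5, q = 0.4 (so that 0 < q < β_• < 1) and all function names are hypothetical.

```python
import math

def sequence(q, beta, n_steps=500):
    """Iterate a_n = q * exp(a_{n-1}) - beta, starting from a_0 = 0."""
    a, seq = 0.0, []
    for _ in range(n_steps):
        a = q * math.exp(a) - beta
        seq.append(a)
    return seq

def quadratic_root(q, beta, c):
    """Negative root of (c*q/2)*x^2 + q*x + q - beta = x.

    c = exp(-beta) gives the lower bound (A15); c = 1 gives the upper bound
    used in the case alpha_lambda < 0.
    """
    disc = (1 - q)**2 - 2 * c * q * (q - beta)
    return ((1 - q) - math.sqrt(disc)) / (c * q)

beta, q = 0.5, 0.4                                # hypothetical parameters
seq = sequence(q, beta)
x0 = seq[-1]                                      # numerical fixed point
x0_lower = quadratic_root(q, beta, math.exp(-beta))
x0_upper = quadratic_root(q, beta, 1.0)
print(x0_lower, x0, x0_upper)
```

For these parameters the fixed point lies in ]-β_•, q - β_•[ and is enclosed by the two quadratic roots, in line with the orderings stated around (A15) and in the case α_λ < 0.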

Appendix A.2. Proofs and Auxiliary Lemmas for Section 5

We start with two lemmas which will be useful for the proof of Theorem 3. They deal with the sequence

{(a_{n}^{(q_{λ})})}_{n \in N}

from (36).

Lemma A2.

For arbitrarily fixed parameter constellation

(β_{A}, β_{H}, α_{A}, α_{H}, λ) \in P \times] 0, 1 [

, suppose that

q_{λ} > 0

and

{lim}_{λ ↗ 1} q_{λ} = β_{A}

holds. Then one gets the limit

\forall n \in N : lim_{λ ↗ 1} a_{n}^{(q_{λ})} = 0 .

(A21)

Proof.

This can be easily seen by induction: for

n = 1

there clearly holds

lim_{λ ↗ 1} a_{1}^{(q_{λ})} = lim_{λ ↗ 1} (q_{λ} - β_{λ}) = β_{A} - β_{A} = 0 .

Assume now that

{lim}_{λ ↗ 1} a_{k}^{(q_{λ})} = 0

holds for all

k \in N

,

k \leq n - 1

, then

lim_{λ ↗ 1} a_{n}^{(q_{λ})} = lim_{λ ↗ 1} (q_{λ} \cdot e^{a_{n - 1}^{(q_{λ})}} - β_{λ}) = β_{A} \cdot 1 - β_{A} = 0 . □

Lemma A3.

In addition to the assumptions of Lemma A2, suppose that

λ \mapsto q_{λ}

is continuously differentiable on

] 0, 1 [

and that the limit

l : = {lim}_{λ ↗ 1} \frac{\partial q_{λ}}{\partial λ}

is finite. Then, for all

n \in N

one obtains

lim_{λ ↗ 1} \frac{\partial a_{n}^{(q_{λ})}}{\partial λ} = u_{n} : = \{\begin{matrix} \frac{l + β_{H} - β_{A}}{1 - β_{A}} \cdot (1 - {(β_{A})}^{n}), & if β_{A} \neq 1, \\ n \cdot (l + β_{H} - 1), & if β_{A} = 1, \end{matrix}

(A22)

which is the unique solution of the linear recursion equation

u_{n} = l + β_{H} - β_{A} + β_{A} \cdot u_{n - 1}, u_{0} = 0 .

(A23)

Furthermore, for all

n \in N

there holds

\sum_{k = 1}^{n} lim_{λ ↗ 1} \frac{\partial a_{k}^{(q_{λ})}}{\partial λ} = \sum_{k = 1}^{n} u_{k} = \{\begin{matrix} \frac{l + β_{H} - β_{A}}{1 - β_{A}} \cdot [n - \frac{β_{A}}{1 - β_{A}} (1 - {(β_{A})}^{n})], & if β_{A} \neq 1, \\ \frac{n \cdot (n + 1)}{2} \cdot (l + β_{H} - 1), & if β_{A} = 1 . \end{matrix}

Proof.

Clearly,

u_{n}

defined by (A22) is the unique solution of (A23). We prove by induction that

{lim}_{λ ↗ 1} \frac{\partial a_{n}^{(q_{λ})}}{\partial λ} = u_{n}

holds. For

n = 1

one gets

lim_{λ ↗ 1} \frac{\partial a_{1}^{(q_{λ})}}{\partial λ} = lim_{λ ↗ 1} \frac{\partial (q_{λ} - β_{λ})}{\partial λ} = l - (β_{A} - β_{H}) = u_{1} .

Suppose now that (A22) holds for all

k \in N

,

k \leq n - 1

. Then, by incorporating (A21) we obtain

\begin{matrix} lim_{λ ↗ 1} \frac{\partial a_{n}^{(q_{λ})}}{\partial λ} & = & lim_{λ ↗ 1} \frac{\partial}{\partial λ} (q_{λ} \cdot e^{a_{n - 1}^{(q_{λ})}} - β_{λ}) = lim_{λ ↗ 1} e^{a_{n - 1}^{(q_{λ})}} \cdot (\frac{\partial q_{λ}}{\partial λ} + q_{λ} \frac{\partial a_{n - 1}^{(q_{λ})}}{\partial λ}) - (β_{A} - β_{H}) \\ = & l - (β_{A} - β_{H}) + β_{A} \cdot u_{n - 1} = u_{n} . \end{matrix}

The remaining assertions follow immediately. □
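The closed forms in Lemma A3 are easy to cross-check against the recursion (A23). The sketch below does this for a few hypothetical values of l, β_A and β_H, including the boundary case β_A = 1; the function names are illustrative only.

```python
def u_recursive(n, l, beta_A, beta_H):
    """u_n from the linear recursion (A23) with u_0 = 0."""
    u = 0.0
    for _ in range(n):
        u = l + beta_H - beta_A + beta_A * u
    return u

def u_closed_form(n, l, beta_A, beta_H):
    """u_n from the closed form (A22)."""
    c = l + beta_H - beta_A
    if beta_A == 1.0:
        return n * c
    return c / (1 - beta_A) * (1 - beta_A**n)

def u_partial_sum(n, l, beta_A, beta_H):
    """Closed form of u_1 + ... + u_n as stated in Lemma A3."""
    c = l + beta_H - beta_A
    if beta_A == 1.0:
        return n * (n + 1) / 2 * c
    return c / (1 - beta_A) * (n - beta_A / (1 - beta_A) * (1 - beta_A**n))

# Hypothetical parameter triples (l, beta_A, beta_H).
for l, b_A, b_H in [(0.25, 0.8, 1.3), (-0.4, 1.5, 0.7), (0.1, 1.0, 2.0)]:
    print(l, b_A, b_H,
          u_recursive(5, l, b_A, b_H),
          u_closed_form(5, l, b_A, b_H),
          u_partial_sum(5, l, b_A, b_H))
```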

We are now ready to give the

Proof of Theorem 3.

(a) Recall that for the setup

(β_{A}, β_{H}, α_{A}, α_{H}) \in (P_{NI} \cup P_{SP, 1})

we chose the intercept as

p_{λ} : = p_{λ}^{E} : = α_{A}^{λ} α_{H}^{1 - λ}

and the slope as

q_{λ} : = q_{λ}^{E} : = β_{A}^{λ} β_{H}^{1 - λ}

, which in (39) lead to the exact value

V_{λ, X_{0}, n}

of the Hellinger integral. Because of

\frac{p_{λ}}{q_{λ}} β_{λ} - α_{λ} = 0

as well as

{lim}_{λ ↗ 1} q_{λ} = β_{A}

, we obtain by using (38) and Lemma A2 for all

X_{0} \in N

and for all

n \in N

lim_{λ ↗ 1} V_{λ, X_{0}, n} : = lim_{λ ↗ 1} exp \{a_{n}^{(q_{λ})} \cdot X_{0} + \sum_{k = 1}^{n} b_{k}^{(p_{λ}, q_{λ})}\} = lim_{λ ↗ 1} exp \{a_{n}^{(q_{λ})} \cdot X_{0} + \frac{α_{A}}{β_{A}} \sum_{k = 1}^{n} a_{k}^{(q_{λ})}\} = 1,

which leads by (68) to

\begin{matrix} I (P_{A, n} ∥ P_{H, n}) & = & lim_{λ ↗ 1} \frac{1 - H_{λ} (P_{A, n} ∥ P_{H, n})}{λ \cdot (1 - λ)} = lim_{λ ↗ 1} \frac{1 - V_{λ, X_{0}, n}}{λ \cdot (1 - λ)} \\ = & lim_{λ ↗ 1} \frac{- V_{λ, X_{0}, n}}{1 - 2 λ} \cdot \frac{\partial}{\partial λ} [a_{n}^{(q_{λ})} \cdot X_{0} + \frac{p_{λ}}{q_{λ}} \sum_{k = 1}^{n} a_{k}^{(q_{λ})}] \\ = & lim_{λ ↗ 1} [\frac{\partial a_{n}^{(q_{λ})}}{\partial λ} \cdot X_{0} + (\frac{\partial}{\partial λ} \frac{p_{λ}}{q_{λ}}) \cdot \sum_{k = 1}^{n} a_{k}^{(q_{λ})} + \frac{p_{λ}}{q_{λ}} \cdot \sum_{k = 1}^{n} \frac{\partial a_{k}^{(q_{λ})}}{\partial λ}] . \end{matrix}

(A24)

For further analysis, we use the obvious derivatives

\frac{\partial p_{λ}}{\partial λ} = p_{λ} log (\frac{α_{A}}{α_{H}}), \frac{\partial}{\partial λ} \frac{p_{λ}}{q_{λ}} = \frac{p_{λ}}{q_{λ}} log (\frac{α_{A} β_{H}}{α_{H} β_{A}}), \frac{\partial q_{λ}}{\partial λ} = q_{λ} log (\frac{β_{A}}{β_{H}}),

(A25)

where the subcase

(β_{A}, β_{H}, α_{A}, α_{H}) \in P_{NI}

(with

p_{λ} \equiv 0

) is consistently covered. From (A25) and Lemma A3 we deduce

lim_{λ ↗ 1} \frac{\partial a_{n}^{(q_{λ})}}{\partial λ} \cdot X_{0} = \{\begin{matrix} (β_{A} log (\frac{β_{A}}{β_{H}}) - (β_{A} - β_{H})) \cdot \frac{1 - {(β_{A})}^{n}}{1 - β_{A}} \cdot X_{0}, & if β_{A} \neq 1, \\ n \cdot (β_{A} log (\frac{β_{A}}{β_{H}}) - (β_{A} - β_{H})) \cdot X_{0}, & if β_{A} = 1, \end{matrix}

and by means of (A21)

\forall n \in N : lim_{λ ↗ 1} [(\frac{\partial}{\partial λ} \frac{p_{λ}}{q_{λ}}) \cdot \sum_{k = 1}^{n} a_{k}^{(q_{λ})}] = 0 .

For the last expression in (A24) we again apply Lemma A3 to end up with

lim_{λ ↗ 1} \frac{p_{λ}}{q_{λ}} \cdot \sum_{k = 1}^{n} \frac{\partial}{\partial λ} a_{k}^{(q_{λ})} = \{\begin{matrix} \frac{α_{A} \cdot [β_{A} log (\frac{β_{A}}{β_{H}}) - (β_{A} - β_{H})]}{β_{A} (1 - β_{A})} \cdot [n - \frac{β_{A}}{1 - β_{A}} (1 - {(β_{A})}^{n})], & if β_{A} \neq 1, \\ n \cdot (n + 1) \frac{α_{A}}{2 β_{A}} \cdot [β_{A} log (\frac{β_{A}}{β_{H}}) - (β_{A} - β_{H})], & if β_{A} = 1, \end{matrix}

(A26)

which finishes the proof of part (a). To show part (b), for the corresponding setup

(β_{A}, β_{H}, α_{A}, α_{H})

\in P_{SP} \ P_{SP, 1}

let us first choose, according to (45) in Section 3.4, the intercept as

p_{λ} : = p_{λ}^{L} : = α_{A}^{λ} α_{H}^{1 - λ}

and the slope as

q_{λ} : = q_{λ}^{L} : = β_{A}^{λ} β_{H}^{1 - λ}

, which in part (b) of Proposition 6 lead to the lower bounds

B_{λ, X_{0}, n}^{L}

of the Hellinger integral. This is formally the same choice as in part (a) satisfying

{lim}_{λ ↗ 1} p_{λ} = α_{A}

,

{lim}_{λ ↗ 1} q_{λ} = β_{A}

. In contrast to (a), we now have

\frac{p_{λ}}{q_{λ}} β_{λ} - α_{λ} \neq 0

but nevertheless

lim_{λ ↗ 1} \frac{p_{λ}}{q_{λ}} β_{λ} - α_{λ} = 0 .

From this, (38), part (b) of Proposition 6 and Lemma A2 we obtain

lim_{λ ↗ 1} B_{λ, X_{0}, n}^{L} = lim_{λ ↗ 1} exp \{a_{n}^{(q_{λ})} \cdot X_{0} + \frac{p_{λ}}{q_{λ}} \sum_{k = 1}^{n} a_{k}^{(q_{λ})} + n \cdot (\frac{p_{λ}}{q_{λ}} β_{λ} - α_{λ})\} = 1

(A27)

and hence

\begin{matrix} I (P_{A, n} ∥ P_{H, n}) & \leq & lim_{λ ↗ 1} \frac{1 - B_{λ, X_{0}, n}^{L}}{λ \cdot (1 - λ)} = lim_{λ ↗ 1} \frac{- B_{λ, X_{0}, n}^{L}}{1 - 2 λ} \cdot \frac{\partial}{\partial λ} [a_{n}^{(q_{λ})} X_{0} + \frac{p_{λ}}{q_{λ}} \sum_{k = 1}^{n} a_{k}^{(q_{λ})} + n (\frac{p_{λ}}{q_{λ}} β_{λ} - α_{λ})] \\ = & lim_{λ ↗ 1} [\frac{\partial a_{n}^{(q_{λ})}}{\partial λ} X_{0} + (\frac{\partial}{\partial λ} \frac{p_{λ}}{q_{λ}}) \sum_{k = 1}^{n} a_{k}^{(q_{λ})} + \frac{p_{λ}}{q_{λ}} \sum_{k = 1}^{n} \frac{\partial a_{k}^{(q_{λ})}}{\partial λ} + n \frac{\partial}{\partial λ} (\frac{p_{λ}}{q_{λ}} β_{λ} - α_{λ})] . \end{matrix}

(A28)

In the current setup, the first three expressions in (A28) can be evaluated in exactly the same way as in (A25) to (A26), and for the last expression one has the limit

\begin{matrix} \frac{\partial}{\partial λ} (\frac{p_{λ}}{q_{λ}} β_{λ} - α_{λ}) & = & \frac{p_{λ}}{q_{λ}} log (\frac{α_{A} β_{H}}{α_{H} β_{A}}) \cdot β_{λ} + \frac{p_{λ}}{q_{λ}} \cdot (β_{A} - β_{H}) - (α_{A} - α_{H}) \\ \overset{λ ↗ 1}{⟶} & α_{A} [log (\frac{α_{A} β_{H}}{α_{H} β_{A}}) - \frac{β_{H}}{β_{A}}] + α_{H}, \end{matrix}

which finishes the proof of part (b). □
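The limit computed in part (a) can be compared with a direct numerical evaluation of (1 - V_{λ, X_0, n}) / (λ (1 - λ)) for λ close to 1. The sketch below is an illustration under stated assumptions: it uses the recursion a_n = q_λ e^{a_{n-1}} - β_λ with a_0 = 0, the exponent a_n X_0 + (p_λ / q_λ) Σ a_k from (A24), and as closed form the sum of the two non-vanishing limits displayed before and in (A26). The parameter values are hypothetical and satisfy α_A / β_A = α_H / β_H, so that (p_λ / q_λ) β_λ - α_λ = 0.

```python
import math

def hellinger_exponent(lam, n, X0, a_A, a_H, b_A, b_H):
    """a_n * X0 + (p/q) * sum_k a_k with p, q the geometric-mean choices of part (a)."""
    q = b_A**lam * b_H**(1 - lam)
    p = a_A**lam * a_H**(1 - lam)
    b_lam = lam * b_A + (1 - lam) * b_H
    a, a_sum = 0.0, 0.0
    for _ in range(n):
        a = q * math.exp(a) - b_lam
        a_sum += a
    return a * X0 + (p / q) * a_sum

def kl_closed_form(n, X0, a_A, b_A, b_H):
    """Sum of the X0-term limit and the limit (A26)."""
    kappa = b_A * math.log(b_A / b_H) - (b_A - b_H)
    if b_A == 1.0:
        return kappa * (n * X0 + n * (n + 1) * a_A / 2)
    geo = (1 - b_A**n) / (1 - b_A)
    return kappa * (geo * X0 + a_A / (b_A * (1 - b_A)) * (n - b_A * geo))

def kl_numeric(n, X0, a_A, a_H, b_A, b_H, lam=1 - 1e-6):
    """Difference quotient (1 - V) / (lam * (1 - lam)) for lam close to 1."""
    expo = hellinger_exponent(lam, n, X0, a_A, a_H, b_A, b_H)
    return -math.expm1(expo) / (lam * (1 - lam))

# Hypothetical parameters with a_A / b_A == a_H / b_H.
print(kl_closed_form(5, 3, 0.4, 0.8, 1.2), kl_numeric(5, 3, 0.4, 0.6, 0.8, 1.2))
print(kl_closed_form(4, 2, 1.0, 1.0, 1.5), kl_numeric(4, 2, 1.0, 1.5, 1.0, 1.5))
```

Using `math.expm1` avoids cancellation when the exponent is of order 1 - λ.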

Proof of Theorem 4.

Let us fix

(β_{A}, β_{H}, α_{A}, α_{H}) \in P_{SP} \ P_{SP, 1}

,

X_{0} \in N

,

n \in N

and

y \in [0, \infty [

. The lower bound

E_{y, X_{0}, n}^{L, \tan}

of the Kullback-Leibler information divergence (relative entropy) is derived by using

ϕ_{λ}^{U} \equiv ϕ_{λ, y}^{\tan}

(cf. (52)), which corresponds to the tangent line of

ϕ_{λ}

at y, as a linear upper bound for

ϕ_{λ}

(

λ \in] 0, 1 [

). More precisely, one gets

ϕ_{λ}^{U} (x) : = (p_{λ}^{U} - α_{λ}) + (q_{λ}^{U} - β_{λ}) x

(

x \in [0, \infty [

) with

p_{λ} : = p_{λ} (y) : = ϕ_{λ} (y) - y ϕ_{λ}^{'} (y) + α_{λ}

and

q_{λ} : = q_{λ} (y) : = ϕ_{λ}^{'} (y) + β_{λ}

, implying

q_{λ} > 0

because of Properties 3 (P17). Analogously to (A27) and (A28), we obtain from (38) and (40) the convergence

{lim}_{λ ↗ 1} B_{λ, X_{0}, n}^{U} = 1

and thus

I (P_{A, n} ∥ P_{H, n}) \geq lim_{λ ↗ 1} [\frac{\partial a_{n}^{(q_{λ})}}{\partial λ} X_{0} + (\frac{\partial}{\partial λ} \frac{p_{λ}}{q_{λ}}) \sum_{k = 1}^{n} a_{k}^{(q_{λ})} + \frac{p_{λ}}{q_{λ}} \sum_{k = 1}^{n} \frac{\partial a_{k}^{(q_{λ})}}{\partial λ} + n \frac{\partial}{\partial λ} (\frac{p_{λ}}{q_{λ}} β_{λ} - α_{λ})] .

(A29)

As before, we compute the involved derivatives. From (30) to (32) as well as (P17) we get

\begin{matrix} \frac{\partial p_{λ}}{\partial λ} & = & {(\frac{f_{A} (y)}{f_{H} (y)})}^{λ} f_{H} (y) log (\frac{f_{A} (y)}{f_{H} (y)}) - β_{A} y {(\frac{f_{A} (y)}{f_{H} (y)})}^{λ - 1} - λ β_{A} y {(\frac{f_{A} (y)}{f_{H} (y)})}^{λ - 1} log (\frac{f_{A} (y)}{f_{H} (y)}) \\ + β_{H} y {(\frac{f_{A} (y)}{f_{H} (y)})}^{λ} - (1 - λ) β_{H} y {(\frac{f_{A} (y)}{f_{H} (y)})}^{λ} log (\frac{f_{A} (y)}{f_{H} (y)}) \\ \overset{λ ↗ 1}{⟶} & α_{A} log (\frac{f_{A} (y)}{f_{H} (y)}) + \frac{y \cdot (α_{A} β_{H} - α_{H} β_{A})}{f_{H} (y)}, \end{matrix}

(A30)

and

\begin{matrix} \frac{\partial q_{λ}}{\partial λ} & = & β_{A} {(\frac{f_{A} (y)}{f_{H} (y)})}^{λ - 1} + λ β_{A} {(\frac{f_{A} (y)}{f_{H} (y)})}^{λ - 1} log (\frac{f_{A} (y)}{f_{H} (y)}) - β_{H} {(\frac{f_{A} (y)}{f_{H} (y)})}^{λ} \\ + (1 - λ) β_{H} {(\frac{f_{A} (y)}{f_{H} (y)})}^{λ} log (\frac{f_{A} (y)}{f_{H} (y)}) \\ \overset{λ ↗ 1}{⟶} & β_{A} (1 + log (\frac{f_{A} (y)}{f_{H} (y)})) - β_{H} \frac{f_{A} (y)}{f_{H} (y)} = : l . \end{matrix}

(A31)

Combining these two limits we get

\begin{matrix} \frac{\partial}{\partial λ} (\frac{p_{λ}}{q_{λ}} β_{λ} - α_{λ}) & = & \frac{q_{λ} (\frac{\partial p_{λ}}{\partial λ}) - p_{λ} (\frac{\partial q_{λ}}{\partial λ})}{{(q_{λ})}^{2}} \cdot β_{λ} + \frac{p_{λ}}{q_{λ}} \cdot (β_{A} - β_{H}) - (α_{A} - α_{H}) \\ \overset{λ ↗ 1}{⟶} & [\frac{y \cdot (α_{A} β_{H} - α_{H} β_{A})}{f_{H} (y)} - α_{A} (1 - \frac{β_{H} f_{A} (y)}{β_{A} f_{H} (y)})] + α_{H} - \frac{α_{A} β_{H}}{β_{A}} \\ = & (α_{H} - α_{A} \frac{β_{H}}{β_{A}}) (1 - \frac{f_{A} (y)}{f_{H} (y)}) . \end{matrix}

(A32)
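The last simplification in (A32) is purely algebraic and can be spot-checked numerically. The sketch below evaluates both forms of the limit with f_A(y) = α_A + β_A y and f_H(y) = α_H + β_H y for a few hypothetical parameter combinations; the function names are illustrative.

```python
def limit_long_form(y, a_A, a_H, b_A, b_H):
    """Second line of (A32), before simplification."""
    f_A, f_H = a_A + b_A * y, a_H + b_H * y
    bracket = y * (a_A * b_H - a_H * b_A) / f_H - a_A * (1 - b_H * f_A / (b_A * f_H))
    return bracket + a_H - a_A * b_H / b_A

def limit_short_form(y, a_A, a_H, b_A, b_H):
    """Third line of (A32), after simplification."""
    f_A, f_H = a_A + b_A * y, a_H + b_H * y
    return (a_H - a_A * b_H / b_A) * (1 - f_A / f_H)

# Hypothetical parameter combinations (all strictly positive).
for params in [(0.0, 1.0, 2.0, 0.5, 1.5), (3.0, 2.0, 0.7, 1.2, 0.9), (10.0, 0.3, 0.3, 2.0, 0.4)]:
    print(params, limit_long_form(*params), limit_short_form(*params))
```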

The above calculation also implies that

{lim}_{λ ↗ 1} (\frac{\partial}{\partial λ} \frac{p_{λ}}{q_{λ}})

is finite and thus

{lim}_{λ ↗ 1} (\frac{\partial}{\partial λ} \frac{p_{λ}}{q_{λ}}) \sum_{k = 1}^{n} a_{k}^{(q_{λ})} = 0

by means of Lemma A2. The proof of

I (P_{A, n} ∥ P_{H, n}) \geq E_{y, X_{0}, n}^{L, \tan}

is finished by using Lemma A3 with l defined in (A31) and by plugging the limits (A30) to (A32) in (A29).

To derive the lower bound

E_{k, X_{0}, n}^{L, \sec}

(cf. (73)) for fixed

k \in N_{0}

, we use as a linear upper bound

ϕ_{λ}^{U}

for

ϕ_{λ} (\cdot)

(

λ \in] 0, 1 [

) the secant line

ϕ_{λ, k}^{\sec}

(cf. (53)) of

ϕ_{λ}

across its arguments k and

k + 1

, corresponding to the choices

p_{λ} : = p_{λ, k}^{\sec} = (k + 1) \cdot ϕ_{λ} (k) - k \cdot ϕ_{λ} (k + 1) + α_{λ}

and

q_{λ} : = q_{λ, k}^{\sec} : = ϕ_{λ} (k + 1) - ϕ_{λ} (k) + β_{λ}

, implying

q_{λ} > 0

because of Properties 3 (P18). As a side remark, notice that this

ϕ_{λ}^{U} (x)

may become positive for some

x \in [0, \infty [

(which is not always consistent with Goal (G1) for fixed

λ

, but leads to a tractable limit bound as

λ

tends to 1). Analogously to (A27) and (A28) we get again

{lim}_{λ ↗ 1} B_{λ, X_{0}, n}^{U} = 1

, which leads to the lower bound given in (A29) with appropriately plugged-in quantities. As in the above proof of the lower bound

E_{y, X_{0}, n}^{L, \tan}

, the inequality

I (P_{A, n} ∥ P_{H, n}) \geq E_{k, X_{0}, n}^{L, \sec}

follows straightforwardly from Lemma A2, Lemma A3 and the three limits

\begin{matrix} \frac{\partial p_{λ}}{\partial λ} = {(\frac{f_{A} (k)}{f_{H} (k)})}^{λ} f_{H} (k) \cdot (k + 1) log (\frac{f_{A} (k)}{f_{H} (k)}) - {(\frac{f_{A} (k + 1)}{f_{H} (k + 1)})}^{λ} f_{H} (k + 1) \cdot k log (\frac{f_{A} (k + 1)}{f_{H} (k + 1)}) \\ \overset{λ ↗ 1}{⟶} f_{A} (k) (k + 1) log (\frac{f_{A} (k)}{f_{H} (k)}) - f_{A} (k + 1) k log (\frac{f_{A} (k + 1)}{f_{H} (k + 1)}), \\ \frac{\partial q_{λ}}{\partial λ} = {(\frac{f_{A} (k + 1)}{f_{H} (k + 1)})}^{λ} f_{H} (k + 1) log (\frac{f_{A} (k + 1)}{f_{H} (k + 1)}) - {(\frac{f_{A} (k)}{f_{H} (k)})}^{λ} f_{H} (k) log (\frac{f_{A} (k)}{f_{H} (k)}) \\ \overset{λ ↗ 1}{⟶} f_{A} (k + 1) log (\frac{f_{A} (k + 1)}{f_{H} (k + 1)}) - f_{A} (k) log (\frac{f_{A} (k)}{f_{H} (k)}) = : l, and \\ \frac{\partial}{\partial λ} (\frac{p_{λ}}{q_{λ}} β_{λ} - α_{λ}) = \frac{q_{λ} (\frac{\partial p_{λ}}{\partial λ}) - p_{λ} (\frac{\partial q_{λ}}{\partial λ})}{{(q_{λ})}^{2}} \cdot β_{λ} + \frac{p_{λ}}{q_{λ}} \cdot (β_{A} - β_{H}) - (α_{A} - α_{H}) \\ \overset{λ ↗ 1}{⟶} f_{A} (k) log (\frac{f_{A} (k)}{f_{H} (k)}) (k + 1 + \frac{α_{A}}{β_{A}}) - f_{A} (k + 1) log (\frac{f_{A} (k + 1)}{f_{H} (k + 1)}) (k + \frac{α_{A}}{β_{A}}) - \frac{α_{A} β_{H}}{β_{A}} + α_{H} . \end{matrix}

To construct the third lower bound

E_{X_{0}, n}^{L, hor}

(cf. (74)), we start by using the horizontal line

ϕ_{λ}^{hor} (\cdot)

(cf. (54)) as an upper bound of

ϕ_{λ}

. For each fixed

λ \in] 0, 1 [

, it is defined by the intercept

{sup}_{x \in N_{0}} ϕ_{λ} (x)

. On

P_{SP, 3 a} \cup P_{SP, 3 b}

, this supremum is achieved at the finite integer point

z_{λ}^{*} : = arg {max}_{x \in N_{0}} ϕ_{λ} (x)

(since

{lim}_{x \to \infty} ϕ_{λ} (x) = - \infty

) and there holds

ϕ_{λ} (z_{λ}^{*}) < 0

which leads with the parameters

q_{λ} = β_{λ}

,

p_{λ} = ϕ_{λ} (z_{λ}^{*}) + α_{λ}

to the Hellinger integral upper bound

B_{λ, X_{0}, n}^{U} = exp \{ϕ_{λ} (z_{λ}^{*}) \cdot n\} < 1

(cf. Remark 1 (b)). We strive for computing the limit

{lim}_{λ ↗ 1} \frac{1 - B_{λ, X_{0}, n}^{U}}{λ (1 - λ)}

, which is not straightforward to solve since in general it seems to be intractable to express

z_{λ}^{*}

explicitly in terms of

λ

. To circumvent this problem, we notice that it is sufficient to determine

z_{λ}^{*}

in a small one-sided ϵ-neighbourhood

] 1 - ϵ, 1 [

. To accomplish this, we incorporate

{lim}_{λ ↗ 1} ϕ_{λ} (x) = 0

for all

x \in [0, \infty [

and calculate by using l’Hospital’s rule

lim_{λ ↗ 1} \frac{ϕ_{λ} (x)}{1 - λ} = (α_{A} + β_{A} x) [- log (\frac{α_{A} + β_{A} x}{α_{H} + β_{H} x}) + 1] - (α_{H} + β_{H} x) .

Accordingly, let us define

z^{*} : = arg {max}_{x \in N_{0}} \{(α_{A} + β_{A} x) [- log (\frac{α_{A} + β_{A} x}{α_{H} + β_{H} x}) + 1] - (α_{H} + β_{H} x)\}

(note that the maximum exists since

{lim}_{x \to \infty} \{(α_{A} + β_{A} x) [- log (\frac{α_{A} + β_{A} x}{α_{H} + β_{H} x}) + 1] - (α_{H} + β_{H} x)\}

= - \infty

). Due to continuity of the function

(λ, x) \mapsto \frac{ϕ_{λ} (x)}{1 - λ}

, there exists an

ϵ > 0

such that for all

λ \in] 1 - ϵ, 1 [

there holds

z_{λ}^{*} = z^{*}

. Applying these considerations, we get with l’Hospital’s rule

I (P_{A, n} ∥ P_{H, n}) \geq lim_{λ ↗ 1} \frac{1 - exp \{ϕ_{λ} (z^{*}) \cdot n\}}{λ (1 - λ)} = [f_{A} (z^{*}) \cdot [log (\frac{f_{A} (z^{*})}{f_{H} (z^{*})}) - 1] + f_{H} (z^{*})] \cdot n \geq 0 .

(A33)

In fact, for the current parameter constellation

P_{SP, 3 a} \cup P_{SP, 3 b}

we have

ϕ_{λ} (x) < 0

for all

λ \in] 0, 1 [

and all

x \in N_{0}

which implies

f_{A} (z^{*}) \neq f_{H} (z^{*})

by Lemma A1; thus, we even get

E_{X_{0}, n}^{L, hor} > 0

for all

n \in N

by virtue of the inequality

- log (\frac{f_{H} (z^{*})}{f_{A} (z^{*})}) > - \frac{f_{H} (z^{*})}{f_{A} (z^{*})} + 1

.
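The construction of the horizontal-line bound in (A33) can be sketched as follows. The code searches z^{*} on a finite integer grid (an assumption: the maximiser is taken to lie below the hypothetical cut-off `x_max`) and evaluates the right-hand side of (A33). The parameter values are hypothetical, with α_A ≠ α_H, β_A ≠ β_H and a non-integer crossing point of f_A and f_H; whether they belong to P_{SP, 3 a} \cup P_{SP, 3 b} depends on the definitions given earlier in the article.

```python
import math

def poisson_term(f_A, f_H):
    """f_A * (log(f_A / f_H) - 1) + f_H, the bracket in (A33)."""
    return f_A * (math.log(f_A / f_H) - 1.0) + f_H

def horizontal_bound(n, a_A, a_H, b_A, b_H, x_max=1000):
    """Return (z_star, bound): grid maximiser z* and the bound of (A33)."""
    def term(x):
        return poisson_term(a_A + b_A * x, a_H + b_H * x)
    # Maximising the negated term over N_0 is the same as minimising the term.
    z_star = min(range(x_max + 1), key=term)
    return z_star, n * term(z_star)

# Hypothetical parameters; f_A and f_H cross at (3 - 1) / (1.2 - 0.5), not an integer.
z_star, bound = horizontal_bound(n=10, a_A=3.0, a_H=1.0, b_A=0.5, b_H=1.2)
print(z_star, bound)
```

The bound grows linearly in n and is strictly positive here because f_A(z^{*}) ≠ f_H(z^{*}).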

For the case

P_{SP, 2}

, the above-mentioned procedure leads to

z_{λ}^{*} = 0 = z^{*}

(

λ \in] 0, 1 [

) which implies

ϕ_{λ} (z_{λ}^{*}) = 0

,

B_{λ, X_{0}, n}^{U} \equiv 1

and thus the trivial lower bound

E_{X_{0}, n}^{L, hor} = {lim}_{λ ↗ 1} \frac{1 - B_{λ, X_{0}, n}^{U}}{λ (1 - λ)} = 0

follows for all

n \in N

. In contrast, for the case

P_{SP, 3 c}

one gets

z_{λ}^{*} = \frac{α_{A} - α_{H}}{β_{H} - β_{A}} = z^{*}

(

λ \in] 0, 1 [

) which nevertheless also implies

ϕ_{λ} (z_{λ}^{*}) = 0

and hence

E_{X_{0}, n}^{L, hor} \equiv 0

. On

P_{SP, 4}

, we have

{sup}_{x \in N_{0}} ϕ_{λ} (x) = {lim}_{x \to \infty} ϕ_{λ} (x) = 0

and hence we set

E_{X_{0}, n}^{L, hor} \equiv 0

.

To show the strict positivity

E_{X_{0}, n}^{L} > 0

in the parameter case

P_{SP, 2}

, we inspect the bound

E_{0, X_{0}, n}^{L, \sec}

. With

α : = α_{•} : = α_{A} = α_{H}

(the bullet will be omitted in this proof) and the auxiliary variable

x : = \frac{β_{H}}{β_{A}} > 0

, the definition (73) respectively its special case (76) rewrites for all

n \in N

as

E_{0, X_{0}, n}^{L, \sec} : = E_{0, X_{0}, n}^{L, \sec} (x) : = \{\begin{matrix} [- (α + β_{A}) \cdot log (\frac{α + β_{A} x}{α + β_{A}}) + β_{A} (x - 1)] \cdot \frac{1 - {(β_{A})}^{n}}{1 - β_{A}} \cdot [X_{0} - \frac{α}{1 - β_{A}}] \\ + [\frac{α}{β_{A} (1 - β_{A})} (- (α + β_{A}) \cdot log (\frac{α + β_{A} x}{α + β_{A}}) + β_{A} (x - 1)) \\ + \frac{α}{β_{A}} (α + β_{A}) \cdot log (\frac{α + β_{A} x}{α + β_{A}}) - α (x - 1)] \cdot n, & if β_{A} \neq 1, \\ [- (α + 1) \cdot log (\frac{α + x}{α + 1}) + x - 1] \cdot [\frac{α}{2} \cdot n^{2} + (X_{0} + \frac{α}{2}) \cdot n] \\ + [(α + 1) \cdot log (\frac{α + x}{α + 1}) - x + 1] \cdot α \cdot n, & if β_{A} = 1 . \end{matrix}

(A34)

To prove that

E_{0, X_{0}, n}^{L, s e c} > 0

for all

X_{0} \in N

and all

n \in N

it suffices to show that

E_{0, X_{0}, n}^{L, s e c} (1) = (\frac{\partial}{\partial x} E_{0, X_{0}, n}^{L, s e c}) (1) = 0

and

(\frac{\partial^{2}}{\partial x^{2}} E_{0, X_{0}, n}^{L, s e c}) (x) > 0

for all

x \in] 0, \infty [\ {1}

. The assertion

E_{0, X_{0}, n}^{L, s e c} (1) = 0

is trivial from (A34). Moreover, we obtain

(\frac{\partial}{\partial x} E_{0, X_{0}, n}^{L, s e c}) (x) = \{\begin{matrix} β_{A} \cdot [1 - \frac{α + β_{A}}{α + β_{A} x}] \cdot \frac{1 - {(β_{A})}^{n}}{1 - β_{A}} \cdot [X_{0} - \frac{α}{1 - β_{A}}] \\ + α \cdot (1 - \frac{α + β_{A}}{α + β_{A} x}) \cdot \frac{β_{A}}{1 - β_{A}} \cdot n, & if β_{A} \neq 1, \\ [1 - \frac{α + 1}{α + x}] \cdot [\frac{α}{2} \cdot n^{2} + (X_{0} - \frac{α}{2}) \cdot n], & if β_{A} = 1, \end{matrix}

which immediately yields

(\frac{\partial}{\partial x} E_{0, X_{0}, n}^{L, s e c}) (1) = 0

. For the second derivative we get

(\frac{\partial^{2}}{\partial x^{2}} E_{0, X_{0}, n}^{L, s e c}) (x) = \{\begin{matrix} \frac{(α + β_{A}) \cdot β_{A}^{2}}{{(α + β_{A} x)}^{2}} \cdot \frac{1 - {(β_{A})}^{n}}{1 - β_{A}} \cdot [X_{0} - \frac{α}{1 - β_{A}}] \\ + α \frac{α + β_{A}}{{(α + β_{A} x)}^{2}} \cdot \frac{β_{A}^{2}}{1 - β_{A}} \cdot n > 0, & if β_{A} \neq 1, \\ \frac{α + 1}{{(α + x)}^{2}} \cdot [\frac{α}{2} \cdot n^{2} + (X_{0} - \frac{α}{2}) \cdot n] > 0, & if β_{A} = 1, \end{matrix}

(A35)

where the strict positivity of

E_{0, X_{0}, n}^{L, s e c}

in the case

β_{A} \neq 1

follows immediately by replacing

X_{0}

with 0 and by using the obvious relation

\frac{1}{1 - β_{A}} \cdot [n - \frac{1 - β_{A}^{n}}{1 - β_{A}}] = \frac{1}{1 - β_{A}} \sum_{k = 0}^{n - 1} (1 - β_{A}^{k}) > 0

. The strict positivity in the case

β_{A} = 1

is trivial by inspection.
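As an informal numerical sanity check (not part of the original proof), the following sketch evaluates the right-hand side of (A34) and confirms, for a few sample parameters, that it vanishes at x = 1 and is strictly positive elsewhere. The function name `e_sec` and all sample values are illustrative assumptions; only the formula itself is taken from (A34).

```python
import math


def e_sec(x, alpha, beta_a, x0, n):
    """Right-hand side of (A34), with x = beta_H / beta_A and alpha_A = alpha_H = alpha.

    x0 is the initial population size X_0 and n the observation horizon.
    """
    if beta_a != 1.0:
        log_term = math.log((alpha + beta_a * x) / (alpha + beta_a))
        g = -(alpha + beta_a) * log_term + beta_a * (x - 1.0)
        geo = (1.0 - beta_a ** n) / (1.0 - beta_a)
        first = g * geo * (x0 - alpha / (1.0 - beta_a))
        second = (
            alpha / (beta_a * (1.0 - beta_a)) * g
            + alpha / beta_a * (alpha + beta_a) * log_term
            - alpha * (x - 1.0)
        ) * n
        return first + second
    # critical case beta_A = 1
    g = -(alpha + 1.0) * math.log((alpha + x) / (alpha + 1.0)) + x - 1.0
    return g * (alpha / 2.0 * n ** 2 + (x0 + alpha / 2.0) * n) - g * alpha * n


if __name__ == "__main__":
    # Sample values (assumptions): alpha = 0.7, X_0 = 3, n = 5.
    for beta_a in (0.5, 1.0, 1.3):
        print(beta_a, [round(e_sec(x, 0.7, beta_a, 3, 5), 4) for x in (0.2, 1.0, 5.0)])
```

The sample grid covers the subcritical, critical and supercritical choices of β_A; it illustrates the claim but of course does not replace the convexity argument given above.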

For the constellation

P_{SP, 4}

with parameters

β : = β_{•} : = β_{A} = β_{H}

,

α_{A} \neq α_{H}

, the strict positivity of

E_{X_{0}, n}^{L} > 0

follows by showing that

E_{y, X_{0}, n}^{L, t a n}

converges from above to zero as y tends to infinity. This is done by proving

{lim}_{y \to \infty} y \cdot E_{y, X_{0}, n}^{L, t a n} \in] 0, \infty [

. To see this, let us first observe that by l’Hospital’s rule we get

lim_{y \to \infty} y \cdot log (\frac{α_{A} + β y}{α_{H} + β y}) = \frac{α_{A} - α_{H}}{β} as well as lim_{y \to \infty} y \cdot (1 - \frac{α_{A} + β y}{α_{H} + β y}) = - \frac{α_{A} - α_{H}}{β} .

From this and (72), we obtain

{lim}_{y \to \infty} y \cdot E_{y, X_{0}, n}^{L, t a n} = \frac{{(α_{A} - α_{H})}^{2}}{β} \cdot n > 0

in both cases

β \neq 1

and

β = 1

.
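The two limits obtained above by l'Hospital's rule can be illustrated numerically. The short sketch below evaluates both scaled expressions for a large value of y; the function name `scaled_terms` and the sample parameters are assumptions made only for this illustration.

```python
import math


def scaled_terms(y, alpha_a, alpha_h, beta):
    """Return (y * log(r), y * (1 - r)) with r = (alpha_A + beta*y) / (alpha_H + beta*y)."""
    r = (alpha_a + beta * y) / (alpha_h + beta * y)
    return y * math.log(r), y * (1.0 - r)


if __name__ == "__main__":
    alpha_a, alpha_h, beta = 2.0, 0.5, 0.8      # sample values (assumptions)
    target = (alpha_a - alpha_h) / beta          # claimed limit of the first term
    for y in (1e2, 1e4, 1e6):
        log_part, lin_part = scaled_terms(y, alpha_a, alpha_h, beta)
        print(y, log_part - target, lin_part + target)
```

Both printed differences shrink as y grows, in line with the limits (α_A − α_H)/β and −(α_A − α_H)/β.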

Finally, for the parameter case

P_{SP, 3 c}

we consider the bound

E_{y^{*}, X_{0}, n}^{L, t a n}

, with

y^{*} = \frac{α_{A} - α_{H}}{β_{H} - β_{A}}

. Since

α_{A} + β_{A} y^{*} = α_{H} + β_{H} y^{*}

, it is easy to see that

E_{y^{*}, X_{0}, n}^{L, t a n} = 0

for all

n \in N

. However, the condition

(\frac{\partial}{\partial y} E_{y, X_{0}, n}^{L, t a n}) (y^{*}) \neq 0

implies that

{sup}_{y \geq 0} E_{y, X_{0}, n}^{L, t a n} > 0

. The explicit form (75) of this condition follows from

(\frac{\partial}{\partial y} E_{y, X_{0}, n}^{L, t a n}) (y) = \{\begin{matrix} \frac{{(α_{A} β_{H} - α_{H} β_{A})}^{2}}{f_{A} (y) {(f_{H} (y))}^{2}} \cdot \frac{1 - {(β_{A})}^{n}}{1 - β_{A}} \cdot [X_{0} - \frac{α_{A}}{1 - β_{A}}] \\ + \frac{α_{A} β_{H} - α_{H} β_{A}}{{(f_{H} (y))}^{2}} \cdot [\frac{α_{A}}{β_{A} (1 - β_{A}) f_{A} (y)} - \frac{α_{A} β_{H} - α_{H} β_{A}}{β_{A}}] \cdot n, & if β_{A} \neq 1, \\ \frac{{(α_{A} β_{H} - α_{H})}^{2}}{f_{A} (y) {(f_{H} (y))}^{2}} \cdot [\frac{α_{A}}{2} \cdot n^{2} + (X_{0} + \frac{α_{A}}{2}) \cdot n] - \frac{{(α_{A} β_{H} - α_{H})}^{2}}{{(f_{H} (y))}^{2}} \cdot n, & if β_{A} = 1, \end{matrix}

y \geq 0

, by using the particular choice

y = y^{*}

together with

f_{A} (y^{*}) = f_{H} (y^{*}) = - \frac{α_{A} β_{H} - α_{H} β_{A}}{β_{A} - β_{H}}

. □

Appendix A.3. Proofs and Auxiliary Lemmas for Section 6

Proof of Lemma 2.

A closed-form representation of a sequence

{({\tilde{a}}_{n})}_{n \in N_{0}}

defined in (83) to (85) is given by the formula

{\tilde{a}}_{n} = \sum_{k = 0}^{n - 1} (c + ρ_{k}) d^{n - 1 - k} .

(A36)

This can be seen by induction: from (83) we obtain with

{\tilde{a}}_{0} = 0

for the first element

{\tilde{a}}_{1} = c + ρ_{0} = \sum_{k = 0}^{0} (c + ρ_{k}) d^{- k}

. Supposing that (A36) holds for the n-th element, the induction step is

{\tilde{a}}_{n + 1} = c + d \cdot {\tilde{a}}_{n} + ρ_{n} = c + d \cdot \sum_{k = 0}^{n - 1} (c + ρ_{k}) d^{n - 1 - k} + ρ_{n} = \sum_{k = 0}^{n} (c + ρ_{k}) d^{n - k} .

In order to obtain the explicit representation of

{\tilde{a}}_{n}

, we consider first the case

0 \leq ν < ϰ < d

and

ρ_{n} = K_{1} \cdot ϰ^{n} + K_{2} \cdot ν^{n}

, which leads to

\begin{matrix} {\tilde{a}}_{n} & = & d^{n - 1} \sum_{k = 0}^{n - 1} (c \cdot d^{- k} + K_{1} \cdot {(\frac{ϰ}{d})}^{k} + K_{2} \cdot {(\frac{ν}{d})}^{k}) \\ = & d^{n - 1} \cdot [c \cdot \frac{1 - d^{- n}}{1 - d^{- 1}} + K_{1} \cdot \frac{1 - {(\frac{ϰ}{d})}^{n}}{1 - \frac{ϰ}{d}} + K_{2} \cdot \frac{1 - {(\frac{ν}{d})}^{n}}{1 - \frac{ν}{d}}] \\ = & \frac{c}{1 - d} (1 - d^{n}) + K_{1} \cdot \frac{d^{n} - ϰ^{n}}{d - ϰ} + K_{2} \cdot \frac{d^{n} - ν^{n}}{d - ν} . \end{matrix}

(A37)

Hence, for the corresponding sum we get

\begin{matrix} \sum_{k = 1}^{n} {\tilde{a}}_{k} & = & \sum_{k = 1}^{n} [\frac{c}{1 - d} + (\frac{K_{1}}{d - ϰ} + \frac{K_{2}}{d - ν} - \frac{c}{1 - d}) \cdot d^{k} - \frac{K_{1}}{d - ϰ} \cdot ϰ^{k} - \frac{K_{2}}{d - ν} \cdot ν^{k}] \\ = & \frac{c}{1 - d} \cdot n + (\frac{K_{1}}{d - ϰ} + \frac{K_{2}}{d - ν} - \frac{c}{1 - d}) \cdot \frac{d \cdot (1 - d^{n})}{1 - d} - \frac{K_{1} \cdot ϰ \cdot (1 - ϰ^{n})}{(d - ϰ) (1 - ϰ)} - \frac{K_{2} \cdot ν \cdot (1 - ν^{n})}{(d - ν) (1 - ν)} . \end{matrix}

(A38)
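The closed forms (A37) and (A38) can be cross-checked against the recursion used in the induction step, i.e. ã_{n+1} = c + d·ã_n + ρ_n with ã_0 = 0 and ρ_n = K_1·ϰ^n + K_2·ν^n. The following sketch is a plain numerical comparison under the assumption 0 ≤ ν < ϰ < d with d ≠ 1; all function names and sample constants are hypothetical.

```python
def recursion(c, d, rho, n):
    """Return [a_0, ..., a_n] for a_{k+1} = c + d * a_k + rho(k), a_0 = 0."""
    a = [0.0]
    for k in range(n):
        a.append(c + d * a[-1] + rho(k))
    return a


def closed_a(c, d, k1, kappa, k2, nu, n):
    """Closed form (A37) for the n-th sequence element."""
    return (c / (1 - d) * (1 - d ** n)
            + k1 * (d ** n - kappa ** n) / (d - kappa)
            + k2 * (d ** n - nu ** n) / (d - nu))


def closed_sum(c, d, k1, kappa, k2, nu, n):
    """Closed form (A38) for the sum of the first n sequence elements."""
    lead = k1 / (d - kappa) + k2 / (d - nu) - c / (1 - d)
    return (c / (1 - d) * n
            + lead * d * (1 - d ** n) / (1 - d)
            - k1 * kappa * (1 - kappa ** n) / ((d - kappa) * (1 - kappa))
            - k2 * nu * (1 - nu ** n) / ((d - nu) * (1 - nu)))


def max_error(c, d, k1, kappa, k2, nu, n_max):
    """Largest deviation between recursion and closed forms for n = 1, ..., n_max."""
    a = recursion(c, d, lambda k: k1 * kappa ** k + k2 * nu ** k, n_max)
    worst = 0.0
    for n in range(1, n_max + 1):
        worst = max(worst,
                    abs(a[n] - closed_a(c, d, k1, kappa, k2, nu, n)),
                    abs(sum(a[1:n + 1]) - closed_sum(c, d, k1, kappa, k2, nu, n)))
    return worst


if __name__ == "__main__":
    # Sample constants (assumptions) with 0 <= nu < kappa < d.
    print(max_error(c=-0.3, d=0.6, k1=0.2, kappa=0.4, k2=-0.1, nu=0.25, n_max=12))
```

The reported deviation should be of the order of floating-point rounding only.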

Consider now the case

0 \leq ν < ϰ = d

. Then some expressions in (A37) and (A38) have a zero denominator. In this case, the evaluation of (A36) becomes

\begin{matrix} {\tilde{a}}_{n} & = & d^{n - 1} \sum_{k = 0}^{n - 1} (c \cdot d^{- k} + K_{1} + K_{2} \cdot {(\frac{ν}{d})}^{k}) = d^{n - 1} \cdot [c \cdot \frac{1 - d^{- n}}{1 - d^{- 1}} + K_{1} \cdot n + K_{2} \cdot \frac{1 - {(\frac{ν}{d})}^{n}}{1 - \frac{ν}{d}}] \\ = & \frac{c}{1 - d} (1 - d^{n}) + K_{1} \cdot n \cdot d^{n - 1} + K_{2} \cdot \frac{d^{n} - ν^{n}}{d - ν} . \end{matrix}

(A39)

Before we calculate the corresponding sum

\sum_{k = 1}^{n} {\tilde{a}}_{k}

, we notice that

\sum_{k = 1}^{n} k \cdot d^{k - 1} = \sum_{k = 1}^{n} \frac{\partial}{\partial d} d^{k} = \frac{\partial}{\partial d} \sum_{k = 1}^{n} d^{k} = \frac{\partial}{\partial d} (\frac{d \cdot (1 - d^{n})}{1 - d}) = \frac{1 - n \cdot d^{n} (1 - d) - d^{n}}{{(1 - d)}^{2}} .

Using this fact, we obtain

\begin{matrix} \sum_{k = 1}^{n} {\tilde{a}}_{k} = \sum_{k = 1}^{n} [\frac{c}{1 - d} (1 - d^{k}) + K_{1} \cdot k \cdot d^{k - 1} + K_{2} \cdot \frac{d^{k} - ν^{k}}{d - ν}] \\ = \frac{c}{1 - d} \cdot n + \sum_{k = 1}^{n} (\frac{K_{2}}{d - ν} - \frac{c}{1 - d}) d^{k} + K_{1} \sum_{k = 1}^{n} k \cdot d^{k - 1} - \frac{K_{2}}{d - ν} \sum_{k = 1}^{n} ν^{k} \\ = (\frac{K_{2}}{d - ν} - \frac{c}{1 - d}) \frac{d \cdot (1 - d^{n})}{1 - d} + K_{1} \cdot \frac{1 - n \cdot d^{n} (1 - d) - d^{n}}{{(1 - d)}^{2}} - \frac{K_{2} \cdot ν (1 - ν^{n})}{(d - ν) (1 - ν)} + \frac{c}{1 - d} \cdot n \\ = (\frac{K_{1}}{d (1 - d)} + \frac{K_{2}}{d - ν} - \frac{c}{1 - d}) \frac{d \cdot (1 - d^{n})}{1 - d} - \frac{K_{2} \cdot ν (1 - ν^{n})}{(d - ν) (1 - ν)} + (\frac{c}{1 - d} - \frac{K_{1} \cdot d^{n}}{1 - d}) \cdot n . □ \end{matrix}
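For the boundary case ϰ = d, the same kind of numerical comparison can be made for (A39), for the auxiliary identity for the sum of k·d^{k−1}, and for the final sum formula of this proof. The sketch below is only an illustrative check; the function names and sample constants are assumptions.

```python
def recursion(c, d, rho, n):
    """Return [a_0, ..., a_n] for a_{k+1} = c + d * a_k + rho(k), a_0 = 0."""
    a = [0.0]
    for k in range(n):
        a.append(c + d * a[-1] + rho(k))
    return a


def weighted_power_sum(d, n):
    """Closed form of sum_{k=1}^{n} k * d^(k-1) used in the proof."""
    return (1 - n * d ** n * (1 - d) - d ** n) / (1 - d) ** 2


def closed_a_equal(c, d, k1, k2, nu, n):
    """Closed form (A39), valid when kappa = d."""
    return (c / (1 - d) * (1 - d ** n)
            + k1 * n * d ** (n - 1)
            + k2 * (d ** n - nu ** n) / (d - nu))


def closed_sum_equal(c, d, k1, k2, nu, n):
    """Last line of the sum computation for kappa = d."""
    lead = k1 / (d * (1 - d)) + k2 / (d - nu) - c / (1 - d)
    return (lead * d * (1 - d ** n) / (1 - d)
            - k2 * nu * (1 - nu ** n) / ((d - nu) * (1 - nu))
            + (c / (1 - d) - k1 * d ** n / (1 - d)) * n)


def max_error_equal(c, d, k1, k2, nu, n_max):
    """Largest deviation between recursion and closed forms for n = 1, ..., n_max."""
    a = recursion(c, d, lambda k: k1 * d ** k + k2 * nu ** k, n_max)
    worst = 0.0
    for n in range(1, n_max + 1):
        direct = sum(k * d ** (k - 1) for k in range(1, n + 1))
        worst = max(worst,
                    abs(direct - weighted_power_sum(d, n)),
                    abs(a[n] - closed_a_equal(c, d, k1, k2, nu, n)),
                    abs(sum(a[1:n + 1]) - closed_sum_equal(c, d, k1, k2, nu, n)))
    return worst


if __name__ == "__main__":
    # Sample constants (assumptions) with 0 <= nu < kappa = d.
    print(max_error_equal(c=0.2, d=0.6, k1=-0.3, k2=0.3, nu=0.35, n_max=12))
```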

Proof of Lemma 3.

(a) In this case we have

0 < q < β_{λ}

. To prove part (i), we consider the function

ξ_{λ}^{(q)} (\cdot)

on

[x_{0}^{(q)}, 0]

, the range of the sequence

{(a_{n}^{(q)})}_{n \in N}

(recall Properties 1 (P1)). For tackling the left-hand inequality in (i), we compare

ξ_{λ}^{(q)} (x) = q \cdot e^{x} - β_{λ}

with the quadratic function

{\underset{̲}{Υ}}_{λ}^{(q)} (x) : = \frac{q}{2} e^{x_{0}^{(q)}} \cdot x^{2} + q e^{x_{0}^{(q)}} (1 - x_{0}^{(q)}) \cdot x + x_{0}^{(q)} (1 - q e^{x_{0}^{(q)}} + \frac{q}{2} e^{x_{0}^{(q)}} x_{0}^{(q)}) .

(A40)

Clearly, one has the relations

{\underset{̲}{Υ}}_{λ}^{(q)} (x_{0}^{(q)}) = x_{0}^{(q)} = ξ_{λ}^{(q)} (x_{0}^{(q)})

,

{\underset{̲}{Υ}}_{λ}^{(q)'} (x_{0}^{(q)}) = q \cdot e^{x_{0}^{(q)}} = ξ_{λ}^{(q)'} (x_{0}^{(q)})

, and

{\underset{̲}{Υ}}_{λ}^{(q)''} (x) < ξ_{λ}^{(q)''} (x)

for all

x \in] x_{0}^{(q)}, 0]

. Hence,

{\underset{̲}{Υ}}_{λ}^{(q)} (\cdot)

is on

] x_{0}^{(q)}, 0]

a strict lower functional bound of

ξ_{λ}^{(q)} (\cdot)

. We are now ready to prove the left-hand inequality in (i) by induction. For

n = 1

, we easily see that

{\underset{̲}{a}}_{1}^{(q)} < a_{1}^{(q)}

iff

x_{0}^{(q)} (1 - q e^{x_{0}^{(q)}} + \frac{q}{2} e^{x_{0}^{(q)}} x_{0}^{(q)}) < q - β_{λ}

iff

{\underset{̲}{Υ}}_{λ}^{(q)} (0) < ξ_{λ}^{(q)} (0)

, and the latter is obviously true. Let us assume that

{\underset{̲}{a}}_{n}^{(q)} \leq a_{n}^{(q)}

holds. From this, (93), (78) and (80) we obtain

\begin{matrix} 0 < {\underset{̲}{ρ}}_{n}^{(q)} = \frac{q}{2} e^{x_{0}^{(q)}} {(x_{0}^{(q)} \cdot {(q \cdot e^{x_{0}^{(q)}})}^{n})}^{2} = \frac{q}{2} e^{x_{0}^{(q)}} {(a_{n}^{(q), T} - x_{0}^{(q)})}^{2} \\ < \frac{q}{2} e^{x_{0}^{(q)}} {(a_{n}^{(q)} - x_{0}^{(q)})}^{2} = {\underset{̲}{Υ}}_{λ}^{(q)} (a_{n}^{(q)}) - d^{(q), T} \cdot a_{n}^{(q)} - x_{0}^{(q)} \cdot (1 - d^{(q), T}) \\ < ξ_{λ}^{(q)} (a_{n}^{(q)}) - d^{(q), T} \cdot a_{n}^{(q)} - x_{0}^{(q)} \cdot (1 - d^{(q), T}) \\ < a_{n + 1}^{(q)} - d^{(q), T} \cdot {\underset{̲}{a}}_{n}^{(q)} - x_{0}^{(q)} \cdot (1 - d^{(q), T}) = a_{n + 1}^{(q)} - ξ_{λ}^{(q), T} ({\underset{̲}{a}}_{n}^{(q)}) . \end{matrix}

Thus, there holds

{\underset{̲}{a}}_{n + 1}^{(q)} < a_{n + 1}^{(q)}

. For the right-hand inequality in (i), we proceed analogously:

{\bar{Υ}}_{λ}^{(q)} (x) : = \frac{q}{2} e^{x_{0}^{(q)}} \cdot x^{2} + (1 - \frac{q}{2} e^{x_{0}^{(q)}} x_{0}^{(q)} - \frac{q - β_{λ}}{x_{0}^{(q)}}) \cdot x + q - β_{λ}

(A41)

satisfies

{\bar{Υ}}_{λ}^{(q)} (x_{0}^{(q)}) = x_{0}^{(q)} = ξ_{λ}^{(q)} (x_{0}^{(q)})

,

{\bar{Υ}}_{λ}^{(q)} (0) = q - β_{λ} = ξ_{λ}^{(q)} (0)

as well as

{\bar{Υ}}_{λ}^{(q)''} (x) < ξ_{λ}^{(q)''} (x)

for all

x \in] x_{0}^{(q)}, 0]

. Hence,

{\bar{Υ}}_{λ}^{(q)} (\cdot)

is on

] x_{0}^{(q)}, 0]

a strict upper functional bound of

ξ_{λ}^{(q)} (\cdot)

. Let us first observe the obvious relation

{\bar{a}}_{1}^{(q)} = q - β_{λ} = a_{1}^{(q)} < 0

, and assume that

{\bar{a}}_{n}^{(q)} \geq a_{n}^{(q)}

(

n \in N

) holds. From this, (95), (79), and (80) we obtain the desired inequality

{\bar{a}}_{n + 1}^{(q)} > a_{n + 1}^{(q)}

by

\begin{matrix} 0 > {\bar{ρ}}_{n}^{(q)} = - Γ_{<}^{(q)} {(d^{(q), T})}^{n} \cdot \frac{a_{n}^{(q), S}}{x_{0}^{(q)}} = \frac{q}{2} e^{x_{0}^{(q)}} (a_{n}^{(q), T} - x_{0}^{(q)}) \cdot a_{n}^{(q), S} \\ \geq \frac{q}{2} e^{x_{0}^{(q)}} (a_{n}^{(q)} - x_{0}^{(q)}) \cdot a_{n}^{(q)} = {\bar{Υ}}_{λ}^{(q)} (a_{n}^{(q)}) - d^{(q), S} \cdot a_{n}^{(q)} - (q - β_{λ}) \\ > ξ_{λ}^{(q)} (a_{n}^{(q)}) - d^{(q), S} \cdot a_{n}^{(q)} - (q - β_{λ}) \geq a_{n + 1}^{(q)} - d^{(q), S} \cdot {\bar{a}}_{n}^{(q)} - (q - β_{λ}) = a_{n + 1}^{(q)} - ξ_{λ}^{(q), S} ({\bar{a}}_{n}^{(q)}) . \end{matrix}

The explicit representations of the sequences

{(a_{n}^{(q)})}_{n \in N}

,

{({\underset{̲}{a}}_{n}^{(q)})}_{n \in N}

and

{({\bar{a}}_{n}^{(q)})}_{n \in N}

follow from (86) by incorporating the appropriate constants mentioned in the prelude of Lemma 3. With (83) to (85) and (86) we immediately achieve

{\underset{̲}{a}}_{n}^{(q)} > a_{n}^{(q), T}

for all

n \in N

. Analogously, for all

n \geq 2

, we get

{\bar{ρ}}_{n - 1} < 0

, which implies that

{\bar{a}}_{n}^{(q)} < a_{n}^{(q), S}

for all

n \geq 2

. For

n = 1

one obtains

{\bar{ρ}}_{0} = 0

as well as

{\bar{a}}_{1}^{(q)} = a_{1}^{(q), S} = a_{1}^{(q)} = q - β_{λ}

.

For the second part (ii), we employ the representation (A36) which leads to

\begin{matrix} {\underset{̲}{a}}_{n}^{(q)} = \sum_{k = 0}^{n - 1} {(d^{(q), T})}^{n - 1 - k} \cdot ({\underset{̲}{ρ}}_{k}^{(q)} + x_{0}^{(q)} \cdot (1 - d^{(q), T})) \\ as well as & {\bar{a}}_{n}^{(q)} = \sum_{k = 0}^{n - 1} {(d^{(q), S})}^{n - 1 - k} \cdot ({\bar{ρ}}_{k}^{(q)} + (q - β_{λ})) . \end{matrix}

The strict decreasingness of both sequences follows from

{\underset{̲}{ρ}}_{k}^{(q)} + x_{0}^{(q)} (1 - d^{(q), T}) = \frac{q e^{x_{0}^{(q)}}}{2} {(x_{0}^{(q)})}^{2} {(d^{(q), T})}^{2 k} + x_{0}^{(q)} (1 - d^{(q), T}) \leq {\underset{̲}{Υ}}_{λ}^{(q)} (0) < ξ_{λ}^{(q)} (0) = q - β_{λ} < 0

and from the fact that

{\bar{ρ}}_{k}^{(q)} \leq 0

for all

k \in N_{0}

and

q < β_{λ}

. Part (iii) follows directly from (i), since

d^{(q), T}, d^{(q), S} \in] 0, 1 [

.
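The sandwich property of part (a)(i) can be illustrated numerically. The sketch below iterates the exact recursion a_{n+1} = q·e^{a_n} − β_λ (started at 0) together with the two linear inhomogeneous recursions for the lower and upper approximates. Since the defining formulas (76) to (95) lie outside this appendix, the explicit expressions used for x_0, d^T, d^S, Γ_< and the two ρ-sequences are assumptions inferred from the identities displayed in the proof above; the function names and sample parameters are hypothetical.

```python
import math


def negative_fixed_point(q, beta, iters=200):
    """Bisection for the negative root of q*exp(x) - beta - x, assuming 0 < q < beta."""
    lo, hi = -beta, 0.0                      # g(lo) > 0 > g(hi)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if q * math.exp(mid) - beta - mid > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def sandwich(q, beta, n_max):
    """Return rows (n, lower_n, a_n, upper_n) for n = 1, ..., n_max."""
    x0 = negative_fixed_point(q, beta)
    d_t = q * math.exp(x0)                   # assumed tangent slope d^T at x0
    d_s = (x0 - (q - beta)) / x0             # assumed secant slope d^S on [x0, 0]
    gamma = 0.5 * d_t * x0 ** 2              # assumed constant Gamma_<
    a = low = up = 0.0
    rows = []
    for k in range(n_max):
        low = d_t * low + x0 * (1 - d_t) + gamma * d_t ** (2 * k)
        up = d_s * up + (q - beta) - gamma * d_t ** k * (1 - d_s ** k)
        a = q * math.exp(a) - beta
        rows.append((k + 1, low, a, up))
    return rows


if __name__ == "__main__":
    for row in sandwich(q=0.4, beta=0.9, n_max=8):   # sample values (assumptions)
        print(row)
```

For large n all three sequences approach x_0 and the gaps fall below floating-point resolution, so the comparison is only meaningful for moderate horizons.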

Let us now prove part (b), where

max {0, β_{λ}} < q < min \{1, e^{β_{λ} - 1}\}

is assumed. To tackle part (i), we compare

ξ_{λ}^{(q)} (x) = q \cdot e^{x} - β_{λ}

with the quadratic function

{\underset{̲}{\underset{̲}{υ}}}_{λ}^{(q)} (x) : = \frac{q}{2} \cdot x^{2} + q \cdot (e^{x_{0}^{(q)}} - x_{0}^{(q)}) \cdot x + x_{0}^{(q)} (1 - q e^{x_{0}^{(q)}} + \frac{q}{2} x_{0}^{(q)}) > 0

(A42)

on the interval

[0, x_{0}^{(q)}]

. Clearly, we have

{\underset{̲}{\underset{̲}{υ}}}_{λ}^{(q)} (x_{0}^{(q)}) = ξ_{λ}^{(q)} (x_{0}^{(q)}) = x_{0}^{(q)}

,

{\underset{̲}{\underset{̲}{υ}}}_{λ}^{(q)'} (x_{0}^{(q)}) = ξ_{λ}^{(q)'} (x_{0}^{(q)}) = q e^{x_{0}^{(q)}}

and

0 < {\underset{̲}{\underset{̲}{υ}}}_{λ}^{(q)''} (x) < ξ_{λ}^{(q)''} (x)

for all

x \in] 0, x_{0}^{(q)}]

. Thus,

{\underset{̲}{\underset{̲}{υ}}}_{λ}^{(q)} (\cdot)

constitutes a positive functional lower bound for

ξ_{λ}^{(q)} (\cdot)

on

[0, x_{0}^{(q)}]

. Let us now prove the left-hand inequality of (i) by induction: for

n = 1

we get

{\underset{̲}{a}}_{1}^{(q)} = {\underset{̲}{\underset{̲}{υ}}}_{λ}^{(q)} (0) < ξ_{λ}^{(q)} (0) = a_{1}^{(q)}

. Moreover, by assuming

{\underset{̲}{a}}_{n}^{(q)} \leq a_{n}^{(q)}

for

n \in N

, we obtain with the above-mentioned considerations and (93), (80) and (82)

\begin{matrix} 0 < {\underset{̲}{ρ}}_{n}^{(q)} = Γ_{>}^{(q)} {(d^{(q), S})}^{2 n} = \frac{q}{2} \cdot {(a_{n}^{(q), S} - x_{0}^{(q)})}^{2} < \frac{q}{2} \cdot {(a_{n}^{(q)} - x_{0}^{(q)})}^{2} \\ = \frac{q}{2} {(a_{n}^{(q)})}^{2} + q \cdot (e^{x_{0}^{(q)}} - x_{0}^{(q)}) \cdot a_{n}^{(q)} + x_{0}^{(q)} \cdot (1 - q e^{x_{0}^{(q)}} + \frac{q}{2} x_{0}^{(q)}) - d^{(q), T} a_{n}^{(q)} - c^{(q), T} \\ = {\underset{̲}{\underset{̲}{υ}}}_{λ}^{(q)} (a_{n}^{(q)}) - d^{(q), T} a_{n}^{(q)} - c^{(q), T} < ξ_{λ}^{(q)} (a_{n}^{(q)}) - d^{(q), T} a_{n}^{(q)} - c^{(q), T} \\ < a_{n + 1}^{(q)} - d^{(q), T} {\underset{̲}{a}}_{n}^{(q)} - c^{(q), T} = a_{n + 1}^{(q)} - ξ_{λ}^{(q), T} ({\underset{̲}{a}}_{n}^{(q)}) . \end{matrix}

Hence,

{\underset{̲}{a}}_{n + 1}^{(q)} < a_{n + 1}^{(q)}

. For the right-hand inequality in part (i), we define the quadratic function

{\bar{\bar{υ}}}_{λ}^{(q)} (x) : = \frac{q}{2} \cdot x^{2} + (1 - \frac{q}{2} x_{0}^{(q)} - \frac{q - β_{λ}}{x_{0}^{(q)}}) \cdot x + q - β_{λ},

(A43)

which is a functional upper bound for

ξ_{λ}^{(q)} (\cdot)

on the interval

[0, x_{0}^{(q)}]

since there holds

{\bar{\bar{υ}}}_{λ}^{(q)} (0) = ξ_{λ}^{(q)} (0) = q - β_{λ}

,

{\bar{\bar{υ}}}_{λ}^{(q)} (x_{0}^{(q)}) = ξ_{λ}^{(q)} (x_{0}^{(q)}) = x_{0}^{(q)}

and additionally

{\bar{\bar{υ}}}_{λ}^{(q)''} (x) = q < q e^{x} = ξ_{λ}^{(q)''} (x)

on

] 0, x_{0}^{(q)} [

. Obviously,

{\bar{a}}_{1}^{(q)} = q - β_{λ} = a_{1}^{(q)}

. By assuming

{\bar{a}}_{n}^{(q)} \geq a_{n}^{(q)}

for

n \in N

, we obtain with (80), (82) and (95)

\begin{matrix} 0 > {\bar{ρ}}_{n}^{(q)} = - Γ_{>}^{(q)} \cdot {(d^{(q), S})}^{n} \cdot (1 - {(d^{(q), T})}^{n}) = - \frac{q}{2} \cdot (x_{0}^{(q)} - a_{n}^{(q), S}) \cdot a_{n}^{(q), T} \\ > - \frac{q}{2} \cdot (x_{0}^{(q)} - a_{n}^{(q)}) \cdot a_{n}^{(q)} = {\bar{\bar{υ}}}_{λ}^{(q)} (a_{n}^{(q)}) - \frac{x_{0}^{(q)} - (q - β_{λ})}{x_{0}^{(q)}} \cdot a_{n}^{(q)} - (q - β_{λ}) \\ > ξ_{λ}^{(q)} (a_{n}^{(q)}) - d^{(q), S} a_{n}^{(q)} - c^{(q), S} \geq ξ_{λ}^{(q)} (a_{n}^{(q)}) - d^{(q), S} {\bar{a}}_{n}^{(q)} - c^{(q), S} = a_{n + 1}^{(q)} - ξ_{λ}^{(q), S} ({\bar{a}}_{n}^{(q)}), \end{matrix}

(A44)

which implies

{\bar{a}}_{n + 1}^{(q)} > a_{n + 1}^{(q)}

. The explicit representations of the sequences

{({\underset{̲}{a}}_{n}^{(q)})}_{n \in N}

and

{({\bar{a}}_{n}^{(q)})}_{n \in N}

follow from (86) by employing the appropriate constants mentioned in the prelude of Lemma 3. By means of (83) to (85) and (86), we directly get

{\underset{̲}{a}}_{n}^{(q)} > a_{n}^{(q), T}

for all

n \in N

, whereas

{\bar{a}}_{n}^{(q)} < a_{n}^{(q), S}

holds only for all

n \geq 2

, since

{\bar{ρ}}_{0} = 0

implies that

{\bar{a}}_{1}^{(q)} = a_{1}^{(q), S} = a_{1}^{(q)} = q - β_{λ}

.

The second part (ii) can be proved in the same way as part (ii) of (a), by employing the representation (A36). For the lower bound one has

{\underset{̲}{a}}_{n}^{(q)} = \sum_{k = 0}^{n - 1} {(d^{(q), T})}^{n - 1 - k} \cdot [c^{(q), T} + {\underset{̲}{ρ}}_{k}^{(q)}], with c^{(q), T} > 0 and {\underset{̲}{ρ}}_{k}^{(q)} > 0 .

For the upper bound we get

{\bar{a}}_{n}^{(q)} = \sum_{k = 0}^{n - 1} {(d^{(q), S})}^{n - 1 - k} \cdot [c^{(q), S} + {\bar{ρ}}_{k}^{(q)}],

hence it is enough to show

c^{(q), S} + {\bar{ρ}}_{n}^{(q)} > 0

for all

n \in N_{0}

. Considering the first two lines of calculation (A44) and incorporating

c^{(q), S} = q - β_{λ}

, this can be seen from

c^{(q), S} + {\bar{ρ}}_{n}^{(q)} > {\bar{\bar{υ}}}_{λ}^{(q)} (a_{n}^{(q)}) - \frac{x_{0}^{(q)} - (q - β_{λ})}{x_{0}^{(q)}} \cdot a_{n}^{(q)} = {\bar{\bar{υ}}}_{λ}^{(q)} (a_{n}^{(q)}) - d^{(q), S} \cdot a_{n}^{(q)} > 0,

because on

[0, x_{0}^{(q)}]

there holds

d^{(q), S} \cdot x < x < {\bar{\bar{υ}}}_{λ}^{(q)} (x)

. The last part (iii) can be easily deduced from (i) together with

{lim}_{n \to \infty} n \cdot {(d^{(q), S})}^{n - 1} = 0

. □

The proofs of Theorems 5–9 are mainly based on the following lemma.

Lemma A4.

Recall the quantity

{\tilde{B}}_{λ, X_{0}, n}^{(p, q)}

from (42) for general

p \geq 0, q > 0

(notice that we do not consider parameters

p < 0

,

q \leq 0

in Section 6) as well as the constants

d^{(q), T}, d^{(q), S}

and

Γ_{<}^{(q)}, Γ_{>}^{(q)}

defined in (76), (77) and (91). For all

(β_{A}, β_{H}, α_{A}, α_{H}, λ) \in P \times R \ {0, 1}

, all initial population sizes

X_{0} \in N

and all observation horizons

n \in N

there holds

(a): in the case $p \geq 0$ and $0 < q < β_{λ}$

$\begin{matrix} {\tilde{B}}_{λ, X_{0}, n}^{(p, q)} & \geq & exp {x_{0}^{(q)} \cdot [X_{0} - \frac{p}{q} \cdot \frac{d^{(q), T}}{1 - d^{(q), T}}] \cdot (1 - {(d^{(q), T})}^{n}) + (\frac{p}{q} \cdot (β_{λ} + x_{0}^{(q)}) - α_{λ}) \cdot n \\ + {\underset{̲}{ζ}}_{n}^{(q)} \cdot X_{0} + \frac{p}{q} \cdot {\underset{̲}{ϑ}}_{n}^{(q)}} = : C_{λ, X_{0}, n}^{(p, q), L}, \end{matrix}$

(A45)

$\begin{matrix} {\tilde{B}}_{λ, X_{0}, n}^{(p, q)} & \leq & exp {x_{0}^{(q)} \cdot [X_{0} - \frac{p}{q} \cdot \frac{d^{(q), S}}{1 - d^{(q), S}}] \cdot (1 - {(d^{(q), S})}^{n}) + (\frac{p}{q} \cdot (β_{λ} + x_{0}^{(q)}) - α_{λ}) \cdot n \\ - {\bar{ζ}}_{n}^{(q)} \cdot X_{0} - \frac{p}{q} \cdot {\bar{ϑ}}_{n}^{(q)}} = : C_{λ, X_{0}, n}^{(p, q), U}, \end{matrix}$

(A46)

$\begin{matrix} where & {\underset{̲}{ζ}}_{n}^{(q)} : = Γ_{<}^{(q)} \cdot \frac{{(d^{(q), T})}^{n - 1}}{1 - d^{(q), T}} \cdot (1 - {(d^{(q), T})}^{n}) > 0, \end{matrix}$

(A47)

$\begin{matrix} {\underset{̲}{ϑ}}_{n}^{(q)} & : = & Γ_{<}^{(q)} \cdot \frac{1 - {(d^{(q), T})}^{n}}{{(1 - d^{(q), T})}^{2}} \cdot [1 - \frac{d^{(q), T} (1 + {(d^{(q), T})}^{n})}{1 + d^{(q), T}}] > 0, \end{matrix}$

(A48)

$\begin{matrix} {\bar{ζ}}_{n}^{(q)} & : = & Γ_{<}^{(q)} \cdot [\frac{{(d^{(q), S})}^{n} - {(d^{(q), T})}^{n}}{d^{(q), S} - d^{(q), T}} - {(d^{(q), S})}^{n - 1} \cdot \frac{1 - {(d^{(q), T})}^{n}}{1 - d^{(q), T}}] > 0, \end{matrix}$

(A49)

$\begin{matrix} {\bar{ϑ}}_{n}^{(q)} & : = & Γ_{<}^{(q)} \cdot \frac{d^{(q), T}}{1 - d^{(q), T}} \cdot [\frac{1 - {(d^{(q), S} d^{(q), T})}^{n}}{1 - d^{(q), S} d^{(q), T}} - \frac{{(d^{(q), S})}^{n} - {(d^{(q), T})}^{n}}{d^{(q), S} - d^{(q), T}}] > 0 . \end{matrix}$

(A50)
(b): in the case $p \geq 0$ and $0 < q = β_{λ}$

${\tilde{B}}_{λ, X_{0}, n}^{(p, q)} = exp \{(\frac{p}{q} \cdot (β_{λ} + x_{0}^{(q)}) - α_{λ}) \cdot n\} = exp \{(p - α_{λ}) \cdot n\} .$
(c): in the case $p \geq 0$ and $max {0, β_{λ}} < q < min \{1, e^{β_{λ} - 1}\}$ the bounds $C_{λ, X_{0}, n}^{(p, q), L}$ and $C_{λ, X_{0}, n}^{(p, q), U}$ from (96) and (97) remain valid, but with

$\begin{matrix} {\underset{̲}{ζ}}_{n}^{(q)} & : = & Γ_{>}^{(q)} \cdot \frac{{(d^{(q), T})}^{n} - {(d^{(q), S})}^{2 n}}{d^{(q), T} - {(d^{(q), S})}^{2}} > 0, \end{matrix}$

(A51)

$\begin{matrix} {\underset{̲}{ϑ}}_{n}^{(q)} & : = & \frac{Γ_{>}^{(q)}}{d^{(q), T} - {(d^{(q), S})}^{2}} \cdot [\frac{d^{(q), T} \cdot (1 - {(d^{(q), T})}^{n})}{1 - d^{(q), T}} - \frac{{(d^{(q), S})}^{2} \cdot (1 - {(d^{(q), S})}^{2 n})}{1 - {(d^{(q), S})}^{2}}] > 0, \end{matrix}$

(A52)

$\begin{matrix} {\bar{ζ}}_{n}^{(q)} & : = & Γ_{>}^{(q)} \cdot {(d^{(q), S})}^{n - 1} \cdot [n - \frac{1 - {(d^{(q), T})}^{n}}{1 - d^{(q), T}}] > 0, \end{matrix}$

(A53)

$\begin{matrix} {\bar{ϑ}}_{n}^{(q)} & : = & Γ_{>}^{(q)} \cdot [\frac{d^{(q), S} - d^{(q), T}}{{(1 - d^{(q), S})}^{2} (1 - d^{(q), T})} \cdot (1 - {(d^{(q), S})}^{n}) \\ + \frac{d^{(q), T} (1 - {(d^{(q), S} d^{(q), T})}^{n})}{(1 - d^{(q), T}) (1 - d^{(q), S} d^{(q), T})} - \frac{{(d^{(q), S})}^{n}}{1 - d^{(q), S}} \cdot n] . \end{matrix}$

(A54)
(d): for the special choices $p : = p_{λ}^{E} : = α_{A}^{λ} α_{H}^{1 - λ} > 0, q : = q_{λ}^{E} : = β_{A}^{λ} β_{H}^{1 - λ} > 0$ in the parameter setup $(β_{A}, β_{H}, α_{A}, α_{H}, λ) \in (P_{NI} \cup P_{SP, 1}) \times] λ_{-}, λ_{+} [\ {0, 1}$ we obtain

$lim_{n \to \infty} \frac{1}{n} log (V_{λ, X_{0}, n}) = lim_{n \to \infty} \frac{1}{n} log (C_{λ, X_{0}, n}^{(p_{λ}^{E}, q_{λ}^{E}), L}) = lim_{n \to \infty} \frac{1}{n} log (C_{λ, X_{0}, n}^{(p_{λ}^{E}, q_{λ}^{E}), U}) = \frac{α_{A}}{β_{A}} \cdot x_{0}^{(q_{λ}^{E})} .$
(e): for all general $p \geq 0$ with either $0 < q < β_{λ}$ or $max {0, β_{λ}} < q < min \{1, e^{β_{λ} - 1}\}$ we get

$lim_{n \to \infty} \frac{1}{n} log ({\tilde{B}}_{λ, X_{0}, n}^{(p, q)}) = lim_{n \to \infty} \frac{1}{n} log (C_{λ, X_{0}, n}^{(p, q), L}) = lim_{n \to \infty} \frac{1}{n} log (C_{λ, X_{0}, n}^{(p, q), U}) = \frac{p}{q} \cdot (β_{λ} + x_{0}^{(q)}) - α_{λ} .$

Proof of Lemma A4.

The closed-form bounds

C_{λ, X_{0}, n}^{(p, q), L}

and

C_{λ, X_{0}, n}^{(p, q), U}

are obtained by substituting in the representation (42) (for

{\tilde{B}}_{λ, X_{0}, n}^{(p, q)}

, cf. Theorem 1) the recursive sequence member

a_{n}^{(q)}

by the explicit sequence member

{\underset{̲}{a}}_{n}^{(q)}

respectively

{\bar{a}}_{n}^{(q)}

. From the definitions of these sequences (92) to (95) and from (83) to (85) one can see that we basically have to evaluate the term

exp \{({\tilde{a}}_{n}^{h o m} + {\tilde{c}}_{n}) \cdot X_{0} + \frac{p}{q} \cdot \sum_{k = 1}^{n} ({\tilde{a}}_{k}^{h o m} + {\tilde{c}}_{k}) + (\frac{p}{q} \cdot β_{λ} - α_{λ}) \cdot n\},

(A55)

where

{\tilde{a}}_{n}^{h o m} + {\tilde{c}}_{n} = {\tilde{a}}_{n}

is either interpreted as the lower approximate

{\underset{̲}{a}}_{n}^{(q)}

or as the upper approximate

{\bar{a}}_{n}^{(q)}

. After rearranging and incorporating that

\frac{c^{(q), S}}{1 - d^{(q), S}} = \frac{c^{(q), T}}{1 - d^{(q), T}} = x_{0}^{(q)}

in both approximate cases, we obtain with the help of (86), (87) for the expression (A55) in the case

0 \leq ν < ϰ < d

\begin{matrix} exp {x_{0}^{(q)} \cdot (1 - d^{n}) \cdot [X_{0} - \frac{p}{q} \cdot \frac{d}{1 - d}] + (\frac{p}{q} \cdot (β_{λ} + x_{0}^{(q)}) - α_{λ}) \cdot n \\ + [K_{1} \cdot \frac{d^{n} - ϰ^{n}}{d - ϰ} + K_{2} \cdot \frac{d^{n} - ν^{n}}{d - ν}] \cdot X_{0} \\ + \frac{p}{q} \cdot [(\frac{K_{1}}{d - ϰ} + \frac{K_{2}}{d - ν}) \cdot \frac{d \cdot (1 - d^{n})}{1 - d} - \frac{K_{1} \cdot ϰ \cdot (1 - ϰ^{n})}{(d - ϰ) (1 - ϰ)} - \frac{K_{2} \cdot ν \cdot (1 - ν^{n})}{(d - ν) (1 - ν)}]} . \end{matrix}

(A56)

In the other case

0 \leq ν < ϰ = d

, the application of (88), (89) turns (A55) into

\begin{matrix} exp {x_{0}^{(q)} \cdot (1 - d^{n}) \cdot [X_{0} - \frac{p}{q} \cdot \frac{d}{1 - d}] + (\frac{p}{q} \cdot (β_{λ} + x_{0}^{(q)}) - α_{λ}) \cdot n \\ + [K_{1} \cdot n \cdot d^{n - 1} + K_{2} \cdot \frac{d^{n} - ν^{n}}{d - ν}] \cdot X_{0} \\ + \frac{p}{q} \cdot [(\frac{K_{1}}{d (1 - d)} + \frac{K_{2}}{d - ν}) \cdot \frac{d \cdot (1 - d^{n})}{1 - d} - \frac{K_{2} \cdot ν \cdot (1 - ν^{n})}{(d - ν) (1 - ν)} - \frac{K_{1} \cdot d^{n}}{1 - d} \cdot n]} . \end{matrix}

(A57)

After these preparatory considerations, let us now elaborate the details.

(a) Let

0 < q < β_{λ}

. We obtain a closed-form lower bound for

{\tilde{B}}_{λ, X_{0}, n}^{(p, q)}

by employing the parameters

c \hat{=} c^{(q), T}

,

d \hat{=} d^{(q), T}

,

K_{2} = ν = 0

,

K_{1} = Γ_{<}^{(q)},

and

ϰ = {(d^{(q), T})}^{2}

, cf. (93) in combination with (85). Since

ϰ < d^{(q), T}

, we have to plug these parameters into (A56). The representations of

{\underset{̲}{ζ}}_{n}^{(q)}

and

{\underset{̲}{ϑ}}_{n}^{(q)}

in (A47) and (A48) follow immediately. For a closed-form upper bound, we employ the parameters

c \hat{=} c^{(q), S}

,

d \hat{=} d^{(q), S}

,

- K_{1} = K_{2} = Γ_{<}^{(q)}

,

ϰ = d^{(q), T}

and

ν = d^{(q), S} d^{(q), T}

(in particular,

ϰ < d^{(q), S}

implying that we have to use (A56)). From this, (A49) can be deduced directly; the representation (A50) comes from the expressions in the square brackets in the last line of (A56) and from

\begin{matrix} - (\frac{Γ_{<}^{(q)}}{d^{(q), S} - d^{(q), T}} - \frac{Γ_{<}^{(q)}}{d^{(q), S} - d^{(q), S} d^{(q), T}}) \cdot \frac{d^{(q), S} \cdot (1 - {(d^{(q), S})}^{n})}{1 - d^{(q), S}} + \frac{Γ_{<}^{(q)} \cdot d^{(q), T} \cdot (1 - {(d^{(q), T})}^{n})}{(d^{(q), S} - d^{(q), T}) (1 - d^{(q), T})} \\ - \frac{Γ_{<}^{(q)} \cdot d^{(q), S} d^{(q), T} \cdot (1 - {(d^{(q), S} d^{(q), T})}^{n})}{(d^{(q), S} - d^{(q), S} d^{(q), T}) (1 - d^{(q), S} d^{(q), T})} \\ = - \frac{Γ_{<}^{(q)} \cdot d^{(q), T} (1 - d^{(q), S})}{d^{(q), S} (d^{(q), S} - d^{(q), T}) (1 - d^{(q), T})} \cdot \frac{d^{(q), S} \cdot (1 - {(d^{(q), S})}^{n})}{1 - d^{(q), S}} + \frac{Γ_{<}^{(q)} \cdot d^{(q), T} \cdot (1 - {(d^{(q), T})}^{n})}{(d^{(q), S} - d^{(q), T}) (1 - d^{(q), T})} \\ - \frac{Γ_{<}^{(q)} \cdot d^{(q), T} \cdot (1 - {(d^{(q), S} d^{(q), T})}^{n})}{(1 - d^{(q), T}) (1 - d^{(q), S} d^{(q), T})} \\ = - \frac{Γ_{<}^{(q)} \cdot d^{(q), T}}{1 - d^{(q), T}} \cdot [\frac{1 - {(d^{(q), S} d^{(q), T})}^{n}}{1 - d^{(q), S} d^{(q), T}} + \frac{1 - {(d^{(q), S})}^{n}}{d^{(q), S} - d^{(q), T}} - \frac{1 - {(d^{(q), T})}^{n}}{d^{(q), S} - d^{(q), T}}] \\ = - \frac{Γ_{<}^{(q)} \cdot d^{(q), T}}{1 - d^{(q), T}} \cdot [\frac{1 - {(d^{(q), S} d^{(q), T})}^{n}}{1 - d^{(q), S} d^{(q), T}} - \frac{{(d^{(q), S})}^{n} - {(d^{(q), T})}^{n}}{d^{(q), S} - d^{(q), T}}] = - {\bar{ϑ}}_{n}^{(q)} . \end{matrix}

Part (b) has already been mentioned in Remark 1 (b) and is due to the fact that for

0 < q = β_{λ}

, the sequence

{(a_{n}^{(q)})}_{n \in N}

is itself explicitly representable by

a_{n}^{(q)} = 0

for all

n \in N

(cf. Properties 1 (P2)). Plugging this into (42) gives the desired result.

(c) Let us now consider

max {0, β_{λ}} < q < min {1, e^{β_{λ} - 1}}

. For a closed-form lower bound for

{\tilde{B}}_{λ, X_{0}, n}^{(p, q)}

we have to employ the parameters

c \hat{=} c^{(q), T}

,

d \hat{=} d^{(q), T}

,

K_{2} = ν = 0

,

K_{1} = Γ_{>}^{(q)}

and

ϰ = {(d^{(q), S})}^{2}

, cf. (93) in combination with (85). The representations of

{\underset{̲}{ζ}}_{n}^{(q)}

and

{\underset{̲}{ϑ}}_{n}^{(q)}

in (A51) and (A52) follow immediately from (A56). For a closed-form upper bound, we use the parameters

c \hat{=} c^{(q), S}

,

d \hat{=} d^{(q), S}

,

- K_{1} = K_{2} = Γ_{>}^{(q)}

,

ϰ = d^{(q), S}

and

ν = d^{(q), S} d^{(q), T}

. Notice that in this case we stick to the representation (A57). The formula (104) is obviously valid, and (105) is implied by

\begin{matrix} (\frac{- Γ_{>}^{(q)}}{d^{(q), S} (1 - d^{(q), S})} + \frac{Γ_{>}^{(q)}}{d^{(q), S} - d^{(q), S} d^{(q), T}}) \cdot \frac{d^{(q), S} \cdot (1 - {(d^{(q), S})}^{n})}{1 - d^{(q), S}} \\ = & - Γ_{>}^{(q)} \cdot \frac{d^{(q), S} - d^{(q), T}}{{(1 - d^{(q), S})}^{2} (1 - d^{(q), T})} \cdot (1 - {(d^{(q), S})}^{n}) . \end{matrix}

The parts (d) and (e) are trivial by incorporating that in all respective cases one has

d^{(q), S} \in] 0, 1 [

,

d^{(q), T} \in] 0, 1 [

and

{lim}_{n \to \infty} n \cdot {(d^{(q), S})}^{n} = 0

. □
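The closed-form expressions (A47) to (A54) can be cross-checked numerically against the underlying linear recursions. The sketch below rebuilds the lower and upper approximate sequences step by step and compares them, as well as their partial sums, with x_0·(1 − d^n) plus or minus the respective ζ- and ϑ-terms. As before, the explicit forms of x_0, d^T, d^S, Γ_<, Γ_> and of the ρ-sequences are assumptions inferred from the identities used in the proofs of Lemma 3 and Lemma A4 (their definitions lie outside this appendix); the function names and sample parameters are hypothetical.

```python
import math


def fixed_point(q, beta, lo, hi, iters=200):
    """Bisection root of g(x) = q*exp(x) - beta - x on [lo, hi], assuming g(lo) > 0 > g(hi)."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if q * math.exp(mid) - beta - mid > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def max_closed_form_error(case, q, beta, n_max):
    """case 'a': 0 < q < beta; case 'c': max{0, beta} < q < min{1, exp(beta - 1)}."""
    if case == "a":
        x0 = fixed_point(q, beta, -beta, 0.0)
        gam = 0.5 * q * math.exp(x0) * x0 ** 2       # assumed Gamma_<
    else:
        x0 = fixed_point(q, beta, 0.0, -math.log(q))
        gam = 0.5 * q * x0 ** 2                      # assumed Gamma_>
    dT = q * math.exp(x0)                            # assumed d^T
    dS = (x0 - (q - beta)) / x0                      # assumed d^S
    low = up = sum_low = sum_up = worst = 0.0
    for n in range(1, n_max + 1):
        k = n - 1
        if case == "a":
            rho_low, rho_up = gam * dT ** (2 * k), -gam * dT ** k * (1 - dS ** k)
        else:
            rho_low, rho_up = gam * dS ** (2 * k), -gam * dS ** k * (1 - dT ** k)
        low = dT * low + x0 * (1 - dT) + rho_low     # lower approximate, c = x0 * (1 - d^T)
        up = dS * up + x0 * (1 - dS) + rho_up        # upper approximate, c = x0 * (1 - d^S)
        sum_low += low
        sum_up += up
        geoT = (1 - dT ** n) / (1 - dT)
        mixed = (1 - (dS * dT) ** n) / (1 - dS * dT)
        if case == "a":
            diff = (dS ** n - dT ** n) / (dS - dT)
            z_low = gam * dT ** (n - 1) * geoT                                    # (A47)
            t_low = gam * geoT / (1 - dT) * (1 - dT * (1 + dT ** n) / (1 + dT))    # (A48)
            z_up = gam * (diff - dS ** (n - 1) * geoT)                            # (A49)
            t_up = gam * dT / (1 - dT) * (mixed - diff)                           # (A50)
        else:
            den = dT - dS ** 2
            z_low = gam * (dT ** n - dS ** (2 * n)) / den                         # (A51)
            t_low = gam / den * (dT * geoT
                                 - dS ** 2 * (1 - dS ** (2 * n)) / (1 - dS ** 2))  # (A52)
            z_up = gam * dS ** (n - 1) * (n - geoT)                               # (A53)
            t_up = gam * ((dS - dT) * (1 - dS ** n) / ((1 - dS) ** 2 * (1 - dT))
                          + dT * mixed / (1 - dT) - dS ** n * n / (1 - dS))      # (A54)
        errs = (
            low - (x0 * (1 - dT ** n) + z_low),
            sum_low - (x0 * n - x0 * dT * geoT + t_low),
            up - (x0 * (1 - dS ** n) - z_up),
            sum_up - (x0 * n - x0 * dS * (1 - dS ** n) / (1 - dS) - t_up),
        )
        worst = max(worst, *(abs(e) for e in errs))
    return worst


if __name__ == "__main__":
    print(max_closed_form_error("a", q=0.4, beta=0.9, n_max=15))    # sample values (assumptions)
    print(max_closed_form_error("c", q=0.55, beta=0.5, n_max=15))
```

Inserting the verified sequence values and partial sums into (A55) then reproduces the exponents of the bounds in (A45) and (A46).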

Proof of Theorem 5.

(a) For

λ \in] 0, 1 [

, we get

0 < q_{λ}^{E} < β_{λ}

and the assertion follows by applying part (a) of Lemma A4. Notice that in the current subcase

P_{NI} \cup P_{SP, 1}

there holds

\frac{p_{λ}^{E}}{q_{λ}^{E}} β_{λ} - α_{λ} = 0

as well as

\frac{p_{λ}^{E}}{q_{λ}^{E}} = \frac{α_{A}}{β_{A}} = \frac{α_{H}}{β_{H}}

. For the case

λ \in R \ [0, 1]

, one gets from Lemma A1 that

max {0, β_{λ}} < q_{λ}^{E}

, and there holds

q_{λ}^{E} < min {1, e^{β_{λ} - 1}}

iff

λ \in] λ_{-}, λ_{+} [\ [0, 1]

, cf. Lemma 1. Thus, an application of part (c) of Lemma A4 proves the desired result. The assertion (b) is equivalent to part (d) of Lemma A4. □

Proof of Theorem 6.

The assertions follow immediately from (A45), Lemma A4(b),(e), Proposition 6(d) as well as the incorporation of the fact that for

λ \in] 0, 1 [

there holds

q_{λ}^{L} = β_{A}^{λ} β_{H}^{1 - λ} < β_{λ}

in the case

(β_{A}, β_{H}, α_{A}, α_{H}) \in (P_{SP} \ (P_{SP, 1} \cup P_{SP, 4}))

(i.e.,

β_{A} \neq β_{H}

) respectively

q_{λ}^{L} = β_{λ}

in the case

(β_{A}, β_{H}, α_{A}, α_{H}) \in P_{SP, 4}

(i.e.,

β_{A} = β_{H}

). □

Proof of Theorem 7.

This can be deduced from (A46), from the parts (b), (c) and (e) of Lemma A4 as well as the incorporation of

p_{λ}^{U} \geq α_{A}^{λ} α_{H}^{1 - λ} > 0

for

λ \in] 0, 1 [

. Notice that an inadequate choice of

p_{λ}^{U}, q_{λ}^{U}

may lead to

\frac{p_{λ}^{U}}{q_{λ}^{U}} (β_{λ} + x_{0}^{(q_{λ}^{U})}) - α_{λ} > 0

. □

Proof of Theorem 8.

The assertions follow immediately from (A45) and from the parts (b), (c) and (e) of Lemma A4. Notice that an inadequate choice of

p_{λ}^{L}, q_{λ}^{L}

may lead to

\frac{p_{λ}^{L}}{q_{λ}^{L}} (β_{λ} + x_{0}^{(q_{λ}^{L})}) - α_{λ} < 0

. □

Proof of Theorem 9.

Let

p_{λ}^{U} = α_{A}^{λ} α_{H}^{1 - λ} > max {0, α_{λ}}

and

q_{λ}^{U} = β_{A}^{λ} β_{H}^{1 - λ} > max {0, β_{λ}}

. Since

q_{λ}^{U} < min {1, e^{β_{λ} - 1}}

iff

λ \in] λ_{-}, λ_{+} [\ [0, 1]

(cf. Lemma 1 for

q_{λ} : = q_{λ}^{U}

), this theorem follows from (A46) of Lemma A4, from the parts (b), (e) of Lemma A4 and from part (d) of Proposition 14. □

Appendix A.4. Proofs and Auxiliary Lemmas for Section 7

Proof of Theorem 10.

As already mentioned above, one can adapt the proof of Theorem 9.1.3 in Ethier & Kurtz [138] who deal with drift-parameters

η = 0

,

κ_{•} = 0

, and the different setup of

σ -

independent time-scale and a sequence of critical Galton-Watson processes without immigration with general offspring distribution. For the sake of brevity, we outline here only the main differences from their proof; for similar limit investigations involving offspring/immigration distributions and parametrizations which are incompatible with ours, see e.g., Sriram [142].

As a first step, let us define the generator

A_{•} f (x) : = (η - κ_{•} \cdot x) \cdot f^{'} (x) + \frac{σ^{2}}{2} \cdot x \cdot f^{''} (x), f \in C_{c}^{\infty} ([0, \infty)),

which corresponds to the diffusion process

\tilde{X}

governed by (133). In connection with (130), we study

T_{•}^{(m)} f (x) : = E P_{•} [f (\frac{1}{m} (\sum_{k = 1}^{m x} Y_{0, k}^{(m)} + {\tilde{Y}}_{0}^{(m)}))], x \in E^{(m)} : = \frac{1}{m} N_{0}, f \in C_{c}^{\infty} ([0, \infty)),

where the

Y_{0, k}^{(m)}

,

{\tilde{Y}}_{0}^{(m)}

are independent and (Poisson-

β_{•}^{(m)}

respectively Poisson-

α_{•}^{(m)}

) distributed as the members of the collection

Y^{(m)}

respectively

{\tilde{Y}}^{(m)}

. By the Theorems 8.2.1 and 1.6.5 as well as Corollary 4.8.9 of [138] it is sufficient to show

lim_{m \to \infty} sup_{x \in E^{(m)}} |σ^{2} m (T_{•}^{(m)} f (x) - f (x)) - A_{•} f (x)| = 0, f \in C_{c}^{\infty} ([0, \infty)) .

(A58)

But (A58) follows mainly from the next

Lemma A5.

Let

S_{n}^{(m)} : = \frac{1}{\sqrt{n}} (\sum_{k = 1}^{n} (Y_{0, k}^{(m)} - β_{•}^{(m)}) + {\tilde{Y}}_{0}^{(m)} - α_{•}^{(m)}), n \in N, m \in \bar{N},

with the usual convention

S_{0}^{(m)} : = 0

. Then for all

m \in \bar{N}

,

x \in E^{(m)}

and all

f \in C_{c}^{\infty} ([0, \infty))

\begin{matrix} ϵ^{(m)} (x) : = E P_{•} [\int_{0}^{1} {(S_{m x}^{(m)})}^{2} x (1 - v) (f^{''} (β_{•}^{(m)} x + \frac{α_{•}^{(m)}}{m} + v \sqrt{\frac{x}{m}} S_{m x}^{(m)}) - f^{''} (x)) d v] \\ = \frac{1}{σ^{2}} \cdot [σ^{2} m \cdot (T_{•}^{(m)} f (x) - f (x)) - A_{•} f (x)] + R^{(m)}, where lim_{m \to \infty} R^{(m)} = 0 . \end{matrix}

(A59)

Proof of Lemma A5.

Let us fix

f \in C_{c}^{\infty} ([0, \infty))

. From the involved Poissonian expectations it is easy to see that

lim_{m \to \infty} |σ^{2} m (T_{•}^{(m)} f (0) - f (0)) - A_{•} f (0)| = 0,

and thus (A59) holds for

x = 0

. Accordingly, we next consider the case

x \in E^{(m)} \ {0}

, with fixed

m \in \bar{N}

. From

E P_{•} [{(S_{m x}^{(m)})}^{2}] = β_{•}^{(m)} + \frac{α_{•}^{(m)}}{m x}

we obtain

E P_{•} [{(S_{m x}^{(m)})}^{2} x f^{''} (x) \int_{0}^{1} (1 - v) d v] = \frac{1}{2} (β_{•}^{(m)} \cdot x + \frac{α_{•}^{(m)}}{m}) f^{''} (x) = : a_{m x} \frac{f^{''} (x)}{2} = : a \frac{f^{''} (x)}{2} .

(A60)

Furthermore, with

b_{m x} : = b : = a + \sqrt{x / m} \cdot S_{m x}^{(m)} = \frac{1}{m} (\sum_{k = 1}^{m x} Y_{0, k}^{(m)} + {\tilde{Y}}_{0}^{(m)})

we get on

{S_{m x}^{(m)} \neq 0}

\int_{0}^{1} f^{''} (β_{•}^{(m)} x + \frac{α_{•}^{(m)}}{m} + v \sqrt{\frac{x}{m}} S_{m x}^{(m)}) d v = \sqrt{\frac{m}{x}} \cdot \frac{1}{S_{m x}^{(m)}} \int_{a}^{b} f^{''} (y) d y = \sqrt{\frac{m}{x}} \cdot \frac{f^{'} (b) - f^{'} (a)}{S_{m x}^{(m)}}

(A61)

as well as

\begin{matrix} \int_{0}^{1} v f^{''} (β_{•}^{(m)} x + \frac{α_{•}^{(m)}}{m} + v \sqrt{\frac{x}{m}} S_{m x}^{(m)}) d v = \frac{m}{x {(S_{m x}^{(m)})}^{2}} [\int_{a}^{b} y f^{''} (y) d y - a \int_{a}^{b} f^{''} (y) d y] \\ = \sqrt{\frac{m}{x}} \cdot \frac{f^{'} (b)}{S_{m x}^{(m)}} + \frac{m}{x} \cdot \frac{f (a) - f (b)}{{(S_{m x}^{(m)})}^{2}} . \end{matrix}

(A62)

With our choice

β_{•}^{(m)} = 1 - \frac{κ_{•}}{σ^{2} m}

and

α_{•}^{(m)} = β_{•}^{(m)} \cdot \frac{η}{σ^{2}}

, a Taylor expansion of f at x gives

\begin{matrix} f (a) = f (x) + \frac{1}{σ^{2} m} \cdot f^{'} (x) (β_{•}^{(m)} \cdot η - κ_{•} \cdot x) + o (\frac{1}{m}), \end{matrix}

(A63)

where for the case

η = κ_{•} = 0

we use the convention

o (\frac{1}{m}) \equiv 0

. Combining (A60) to (A63) and the centering

E P_{•} [S_{m x}^{(m)}] = 0

, the left hand side of Equation (A59) becomes

\begin{matrix} E P_{•} [\int_{0}^{1} {(S_{m x}^{(m)})}^{2} x (1 - v) (f^{''} (β_{•}^{(m)} x + \frac{α_{•}^{(m)}}{m} + v \sqrt{\frac{x}{m}} S_{m x}^{(m)}) - f^{''} (x)) d v] \\ = & E P_{•} [\sqrt{m x} \cdot S_{m x}^{(m)} \cdot (f^{'} (b) - f^{'} (a))] - E P_{•} [\sqrt{m x} \cdot S_{m x}^{(m)} \cdot f^{'} (b) + m \cdot (f (a) - f (b))] \\ - \frac{1}{2} (β_{•}^{(m)} \cdot x + \frac{α_{•}^{(m)}}{m}) \cdot f^{''} (x) \\ = & m \cdot (E P_{•} [f (b)] - f (a)) - \frac{1}{2} (β_{•}^{(m)} \cdot x + \frac{α_{•}^{(m)}}{m}) \cdot f^{''} (x) \\ = & m \cdot \{E P_{•} [f (\frac{1}{m} (\sum_{k = 1}^{m x} Y_{0, k}^{(m)} + {\tilde{Y}}_{0}^{(m)}))] - f (x)\} - \frac{1}{σ^{2}} A_{•} f (x) \\ + \frac{1}{σ^{2}} [(η - κ_{•} \cdot x) - β_{•}^{(m)} \cdot η + κ_{•} \cdot x] \cdot f^{'} (x) + \frac{x}{2} [1 - β_{•}^{(m)} - \frac{α_{•}^{(m)}}{m x}] \cdot f^{''} (x) - m \cdot o (\frac{1}{m}) \end{matrix}

which immediately leads to the right hand side of (A59). □

To proceed with the proof of Theorem 10, we obtain for

m \geq 2 κ_{•} / σ^{2}

the inequality

β_{•}^{(m)} \geq 1 / 2

and accordingly for all

v \in] 0, 1 [

,

x \in E^{(m)}

β_{•}^{(m)} x + \frac{α_{•}^{(m)}}{m} + v \sqrt{\frac{x}{m}} S_{m x}^{(m)} = (1 - v) \cdot x \cdot β_{•}^{(m)} + (1 - v) \frac{α_{•}^{(m)}}{m} + \frac{v}{m} (\sum_{k = 1}^{m x} Y_{0, k}^{(m)} + {\tilde{Y}}_{0}^{(m)}) \geq x \cdot \frac{1 - v}{2} .

Suppose that the support of f is contained in the interval

[0, c]

. Correspondingly, for

v \leq 1 - 2 c / x

the integrand in

ϵ^{(m)} (x)

is zero and hence with (A64) we obtain the bounds

\begin{matrix} |\int_{0}^{1} {(S_{m x}^{(m)})}^{2} x (1 - v) (f^{''} (β_{•}^{(m)} x + \frac{α_{•}^{(m)}}{m} + v \sqrt{\frac{x}{m}} S_{m x}^{(m)}) - f^{''} (x)) d v| \\ \leq & \int_{0 \lor (1 - 2 c / x)}^{1} {(S_{m x}^{(m)})}^{2} x (1 - v) \cdot 2 {∥f^{''}∥}_{\infty} d v \leq x \cdot {(S_{m x}^{(m)})}^{2} {(1 \land \frac{2 c}{x})}^{2} {∥f^{''}∥}_{\infty} . \end{matrix}

From this, one can deduce

{lim}_{m \to \infty} {sup}_{x \in E^{(m)}} ϵ^{(m)} (x) = 0

–and thus (A58) – in the same manner as at the end of the proof of Theorem 9.1.3 in [138] (by means of the dominated convergence theorem).
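The generator convergence (A58) can be illustrated numerically. The following is a minimal sketch and not part of the proof: it assumes the test function f(y) = exp(-y), which is smooth but not compactly supported, because the Poisson expectation is then available in closed form. The sum of m·x Poisson-β variables and one Poisson-α variable is Poisson with mean m·x·β + α. The function names and parameter values are hypothetical.

```python
import math


def scaled_difference(m, x, eta, kappa, sigma2):
    """sigma^2 * m * (T^(m) f(x) - f(x)) for the assumed test function f(y) = exp(-y)."""
    beta_m = 1.0 - kappa / (sigma2 * m)      # offspring mean beta^(m)
    alpha_m = beta_m * eta / sigma2          # immigration mean alpha^(m)
    mu = m * x * beta_m + alpha_m            # Poisson mean of the whole sum
    # For N ~ Poisson(mu): E[exp(-N/m)] = exp(mu * (exp(-1/m) - 1)).
    log_ratio = mu * math.expm1(-1.0 / m) + x   # log(T f(x) / f(x))
    return sigma2 * m * math.exp(-x) * math.expm1(log_ratio)


def generator(x, eta, kappa, sigma2):
    """A f(x) = (eta - kappa x) f'(x) + (sigma^2 / 2) x f''(x) for f(y) = exp(-y)."""
    return (eta - kappa * x) * (-math.exp(-x)) + 0.5 * sigma2 * x * math.exp(-x)


if __name__ == "__main__":
    target = generator(2.0, 0.7, 0.3, 1.5)
    for m in (10, 1000, 10**6):
        print(m, scaled_difference(m, 2.0, 0.7, 0.3, 1.5) - target)
```

The printed differences should shrink as m grows, in line with (A58) at the single point x = 2; the sketch does not address the uniformity in x that the proof establishes.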

Proof of Proposition 15.

Let

(κ_{A}, κ_{H}, η) \in {\tilde{P}}_{N I} \cup {\tilde{P}}_{S P, 1}

be fixed. We have to find those orders

λ \in R \ [0, 1]

which satisfy for all sufficiently large

m \in \bar{N}

q_{λ}^{(m)} = {(1 - \frac{κ_{A}}{σ^{2} m})}^{λ} {(1 - \frac{κ_{H}}{σ^{2} m})}^{1 - λ} < min \{1, e^{β_{λ}^{(m)} - 1}\} .

(A64)

In order to achieve this, we interpret

q_{λ}^{(m)} = q_{λ} (\frac{1}{m})

in terms of the function

q_{λ} (x) : = {(1 - \frac{κ_{A}}{σ^{2}} \cdot x)}^{λ} {(1 - \frac{κ_{H}}{σ^{2}} \cdot x)}^{1 - λ}, x \in] - ϵ, ϵ [,

(A65)

for some small enough

ϵ > 0

such that (A65) is well-defined. Since

β_{λ}^{(m)} - 1 = - \frac{κ_{λ}}{σ^{2} \cdot m} = - \frac{κ_{λ}}{σ^{2}} \cdot x = - \frac{λ κ_{A} + (1 - λ) κ_{H}}{σ^{2}} \cdot x

, for the verification of (A64) it suffices to show

\begin{matrix} lim_{x ↘ 0} \frac{1 - q_{λ} (x)}{x} & > & 0, \end{matrix}

(A66)

\begin{matrix} and lim_{x ↘ 0} \frac{e^{- \frac{κ_{λ}}{σ^{2}} \cdot x} - q_{λ} (x)}{x^{2}} & > & 0 . \end{matrix}

(A67)

By l’Hospital’s rule, one gets

{lim}_{x ↘ 0} \frac{1 - q_{λ} (x)}{x} = \frac{λ κ_{A} + (1 - λ) κ_{H}}{σ^{2}} = \frac{κ_{λ}}{σ^{2}}

and hence

\begin{matrix} (A66) & ⟺ & \{\begin{matrix} λ < \frac{κ_{H}}{κ_{H} - κ_{A}}, & if & κ_{A} < κ_{H}, \\ λ > - \frac{κ_{H}}{κ_{A} - κ_{H}}, & if & κ_{A} > κ_{H} . \end{matrix} \end{matrix}

(A68)

To find a condition that guarantees (A67), we use l’Hospital’s rule twice to deduce

lim_{x ↘ 0} \frac{e^{- \frac{κ_{λ}}{σ^{2}} \cdot x} - q_{λ} (x)}{x^{2}} = \frac{1}{2 σ^{4}} [κ_{λ}^{2} - λ (λ - 1) {(κ_{A} - κ_{H})}^{2}] = \frac{1}{2 σ^{4}} [λ κ_{A}^{2} + (1 - λ) κ_{H}^{2}]

and hence we obtain

\begin{matrix} (A67) & ⟺ & \{\begin{matrix} λ < \frac{κ_{H}^{2}}{κ_{H}^{2} - κ_{A}^{2}}, & if & κ_{A} < κ_{H}, \\ λ > - \frac{κ_{H}^{2}}{κ_{A}^{2} - κ_{H}^{2}}, & if & κ_{A} > κ_{H} . \end{matrix} \end{matrix}

(A69)

To compare both the lower and upper bounds in (A68) and (A69), let us calculate

\frac{κ_{H}^{2}}{κ_{H}^{2} - κ_{A}^{2}} - \frac{κ_{H}}{κ_{H} - κ_{A}} = - \frac{κ_{A} κ_{H}}{(κ_{H} - κ_{A}) (κ_{H} + κ_{A})} \{\begin{matrix} < 0, & if & κ_{A} < κ_{H}, \\ > 0, & if & κ_{A} > κ_{H} . \end{matrix}

(A70)

Incorporating this, we observe that both conditions (A66) and (A67) are satisfied simultaneously iff

\begin{matrix} λ & < & min \{\frac{κ_{H}}{κ_{H} - κ_{A}}, \frac{κ_{H}^{2}}{κ_{H}^{2} - κ_{A}^{2}}\} = \frac{κ_{H}^{2}}{κ_{H}^{2} - κ_{A}^{2}} if κ_{A} < κ_{H}, \\ λ & > & max \{- \frac{κ_{H}}{κ_{A} - κ_{H}}, - \frac{κ_{H}^{2}}{κ_{A}^{2} - κ_{H}^{2}}\} = - \frac{κ_{H}^{2}}{κ_{A}^{2} - κ_{H}^{2}} if κ_{A} > κ_{H}, \end{matrix}

which finishes the proof. □
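As a plausibility check of the threshold just derived, condition (A64) can be evaluated directly for a large m. This is a minimal sketch under assumed parameter values; the function names are hypothetical. It uses the observation that the bound κ_H²/(κ_H² - κ_A²) covers both cases, since -κ_H²/(κ_A² - κ_H²) is the same number.

```python
import math


def lambda_bound(kappa_a, kappa_h):
    """Critical order kappa_H^2 / (kappa_H^2 - kappa_A^2); an upper bound if
    kappa_A < kappa_H and a lower bound if kappa_A > kappa_H."""
    return kappa_h**2 / (kappa_h**2 - kappa_a**2)


def condition_a64(lam, kappa_a, kappa_h, sigma2, m):
    """Check q_lambda^(m) < min{1, exp(beta_lambda^(m) - 1)} on the log scale."""
    log_q = (lam * math.log1p(-kappa_a / (sigma2 * m))
             + (1.0 - lam) * math.log1p(-kappa_h / (sigma2 * m)))
    kappa_lam = lam * kappa_a + (1.0 - lam) * kappa_h
    beta_minus_one = -kappa_lam / (sigma2 * m)
    # q < min{1, e^(beta - 1)}  <=>  log q < min{0, beta - 1}
    return log_q < min(0.0, beta_minus_one)


if __name__ == "__main__":
    print(lambda_bound(1.0, 2.0))                    # threshold for kappa_A < kappa_H
    print(condition_a64(1.2, 1.0, 2.0, 1.0, 1000))   # order below the threshold
    print(condition_a64(1.5, 1.0, 2.0, 1.0, 1000))   # order above the threshold
```

Working with log q avoids forming differences of numbers close to 1, which matters because the two sides of (A64) differ only at order 1/m².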

The following lemma is the main tool for the proof of Theorem 11.

Lemma A6.

Let

(κ_{A}, κ_{H}, η, λ) \in ({\tilde{P}}_{N I} \cup {\tilde{P}}_{S P, 1}) \times (] {\tilde{λ}}_{-}, {\tilde{λ}}_{+} [\ {0, 1})

. By using the quantities

κ_{λ} : = λ κ_{A} + (1 - λ) κ_{H}

and

Λ_{λ} : = \sqrt{λ κ_{A}^{2} + (1 - λ) κ_{H}^{2}}

from (150) (which is well-defined, cf. (138)), one gets for all

t > 0

\begin{matrix}
(a) & lim_{m \to \infty} m \cdot (1 - q_{λ}^{(m)}) = lim_{m \to \infty} m \cdot (1 - β_{λ}^{(m)}) = \frac{κ_{λ}}{σ^{2}} . \\
(b) & lim_{m \to \infty} m^{2} \cdot a_{1}^{(m)} = lim_{m \to \infty} m^{2} \cdot (q_{λ}^{(m)} - β_{λ}^{(m)}) = - \frac{λ (1 - λ) {(κ_{A} - κ_{H})}^{2}}{2 σ^{4}} = - \frac{Λ_{λ}^{2} - κ_{λ}^{2}}{2 σ^{4}} . \\
(c) & lim_{m \to \infty} m \cdot x_{0}^{(m)} = - \frac{Λ_{λ} - κ_{λ}}{σ^{2}} \{\begin{matrix} < 0, & if λ \in] 0, 1 [, \\ > 0, & if λ \in] {\tilde{λ}}_{-}, {\tilde{λ}}_{+} [\ [0, 1] . \end{matrix} \\
(d) & lim_{m \to \infty} m^{2} \cdot Γ_{<}^{(m)} = lim_{m \to \infty} m^{2} \cdot Γ_{>}^{(m)} = \frac{{(Λ_{λ} - κ_{λ})}^{2}}{2 σ^{4}} > 0 . \\
(e) & lim_{m \to \infty} m \cdot (1 - d^{(m), S}) = \frac{Λ_{λ} + κ_{λ}}{2 σ^{2}} > 0 . \\
(f) & lim_{m \to \infty} m \cdot (1 - d^{(m), T}) = \frac{Λ_{λ}}{σ^{2}} > 0 . \\
(g) & lim_{m \to \infty} m \cdot (1 - d^{(m), S} d^{(m), T}) = \frac{3 Λ_{λ} + κ_{λ}}{2 σ^{2}} > 0 . \\
(h) & lim_{m \to \infty} {(d^{(m), S})}^{σ^{2} m t} = exp \{- \frac{Λ_{λ} + κ_{λ}}{2} \cdot t\} < 1 . \\
(i) & lim_{m \to \infty} {(d^{(m), T})}^{σ^{2} m t} = exp \{- Λ_{λ} \cdot t\} < 1 . \\
(j) & lim_{m \to \infty} {(d^{(m), S} d^{(m), T})}^{σ^{2} m t} = exp \{- \frac{3 Λ_{λ} + κ_{λ}}{2} \cdot t\} < 1 . \\
(k) & for λ \in] 0, 1 [, there holds for the respective quantities defined in (142) to (145) \\
lim_{m \to \infty} m \cdot {\underset{̲}{ζ}}_{⌊σ^{2} m t⌋}^{(m)} = \frac{{(Λ_{λ} - κ_{λ})}^{2}}{2 σ^{2} \cdot Λ_{λ}} \cdot e^{- Λ_{λ} \cdot t} \cdot (1 - e^{- Λ_{λ} \cdot t}) > 0, \\
lim_{m \to \infty} {\underset{̲}{ϑ}}_{⌊σ^{2} m t⌋}^{(m)} = \frac{1}{4} \cdot {(\frac{Λ_{λ} - κ_{λ}}{Λ_{λ}})}^{2} \cdot {(1 - e^{- Λ_{λ} \cdot t})}^{2} > 0, \\
lim_{m \to \infty} m \cdot {\bar{ζ}}_{⌊σ^{2} m t⌋}^{(m)} = \frac{{(Λ_{λ} - κ_{λ})}^{2}}{σ^{2}} \cdot [\frac{e^{- \frac{1}{2} (Λ_{λ} + κ_{λ}) \cdot t} - e^{- Λ_{λ} \cdot t}}{Λ_{λ} - κ_{λ}} - \frac{e^{- \frac{1}{2} (Λ_{λ} + κ_{λ}) \cdot t} (1 - e^{- Λ_{λ} \cdot t})}{2 \cdot Λ_{λ}}] > 0, \\
lim_{m \to \infty} {\bar{ϑ}}_{⌊σ^{2} m t⌋}^{(m)} = \frac{{(Λ_{λ} - κ_{λ})}^{2}}{Λ_{λ}} \cdot [\frac{1 - e^{- \frac{1}{2} (3 Λ_{λ} + κ_{λ}) \cdot t}}{3 Λ_{λ} + κ_{λ}} + \frac{e^{- Λ_{λ} \cdot t} - e^{- \frac{1}{2} (Λ_{λ} + κ_{λ}) \cdot t}}{Λ_{λ} - κ_{λ}}] > 0 . \\
(l) & for λ \in] {\tilde{λ}}_{-}, {\tilde{λ}}_{+} [\ [0, 1], there holds for the respective quantities defined in (146) to (149) \\
lim_{m \to \infty} m \cdot {\underset{̲}{ζ}}_{⌊σ^{2} m t⌋}^{(m)} = \frac{{(Λ_{λ} - κ_{λ})}^{2}}{2 σ^{2} \cdot κ_{λ}} \cdot e^{- Λ_{λ} \cdot t} \cdot (1 - e^{- κ_{λ} \cdot t}) > 0, \\
lim_{m \to \infty} {\underset{̲}{ϑ}}_{⌊σ^{2} m t⌋}^{(m)} = \frac{{(Λ_{λ} - κ_{λ})}^{2}}{2 \cdot κ_{λ}} \cdot [\frac{1 - e^{- Λ_{λ} \cdot t}}{Λ_{λ}} - \frac{1 - e^{- (Λ_{λ} + κ_{λ}) \cdot t}}{Λ_{λ} + κ_{λ}}] > 0, \\
lim_{m \to \infty} m \cdot {\bar{ζ}}_{⌊σ^{2} m t⌋}^{(m)} = \frac{{(Λ_{λ} - κ_{λ})}^{2}}{2 \cdot σ^{2}} \cdot e^{- \frac{1}{2} (Λ_{λ} + κ_{λ}) \cdot t} \cdot [t - \frac{1 - e^{- Λ_{λ} \cdot t}}{Λ_{λ}}] > 0, \\
lim_{m \to \infty} {\bar{ϑ}}_{⌊σ^{2} m t⌋}^{(m)} = {(Λ_{λ} - κ_{λ})}^{2} \cdot [\frac{(Λ_{λ} - κ_{λ}) (1 - e^{- \frac{1}{2} (Λ_{λ} + κ_{λ}) \cdot t})}{Λ_{λ} \cdot {(Λ_{λ} + κ_{λ})}^{2}} + \frac{1 - e^{- \frac{1}{2} (3 Λ_{λ} + κ_{λ}) \cdot t}}{Λ_{λ} \cdot (3 Λ_{λ} + κ_{λ})} - \frac{e^{- \frac{1}{2} (Λ_{λ} + κ_{λ}) \cdot t}}{Λ_{λ} + κ_{λ}} \cdot t] > 0 .
\end{matrix}

Proof of Lemma A6.

For each of the assertions (a) to (l), we will make use of l’Hospital’s rule. To begin with, we obtain for arbitrary

μ, ν \in R

\begin{matrix} lim_{m \to \infty} m [1 - {(β_{A}^{(m)})}^{μ} {(β_{H}^{(m)})}^{ν}] = lim_{m \to \infty} m^{2} [μ \cdot {(β_{A}^{(m)})}^{μ - 1} {(β_{H}^{(m)})}^{ν} \frac{κ_{A}}{σ^{2} m^{2}} + ν \cdot {(β_{A}^{(m)})}^{μ} {(β_{H}^{(m)})}^{ν - 1} \frac{κ_{H}}{σ^{2} m^{2}}] \\ = μ \frac{κ_{A}}{σ^{2}} + ν \frac{κ_{H}}{σ^{2}} . \end{matrix}

(A71)

From this, the first part of (a) follows immediately and the second part is a direct consequence of the definition of

β_{λ}^{(m)}

. Part (b) can be deduced from (A71):

\begin{matrix} lim_{m \to \infty} m^{2} \cdot a_{1}^{(m)} & = & lim_{m \to \infty} \frac{m}{2 σ^{2}} \cdot [λ \cdot κ_{A} (1 - {(β_{A}^{(m)})}^{λ - 1} {(β_{H}^{(m)})}^{1 - λ}) \\ + (1 - λ) \cdot κ_{H} (1 - {(β_{A}^{(m)})}^{λ} {(β_{H}^{(m)})}^{- λ})] = - \frac{λ (1 - λ) {(κ_{A} - κ_{H})}^{2}}{2 σ^{4}} = - \frac{Λ_{λ}^{2} - κ_{λ}^{2}}{2 σ^{4}} . \end{matrix}
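The limits in parts (a) and (b) can be checked numerically. The sketch below is an illustration under assumed parameter values, with hypothetical function names. It evaluates m·(1 - q_λ^{(m)}) and m²·(q_λ^{(m)} - β_λ^{(m)}) for a finite m, using β_λ^{(m)} = 1 - κ_λ/(σ² m), and compares them with the stated limits.

```python
import math


def scaled_quantities(lam, kappa_a, kappa_h, sigma2, m):
    """Return (m (1 - q), m^2 (q - beta)) for the m-th approximation step."""
    log_q = (lam * math.log1p(-kappa_a / (sigma2 * m))
             + (1.0 - lam) * math.log1p(-kappa_h / (sigma2 * m)))
    kappa_lam = lam * kappa_a + (1.0 - lam) * kappa_h
    q_minus_one = math.expm1(log_q)                  # q - 1, computed stably
    a1 = q_minus_one + kappa_lam / (sigma2 * m)      # q - beta with beta = 1 - kappa_lam/(sigma2 m)
    return -m * q_minus_one, m * m * a1


def limits(lam, kappa_a, kappa_h, sigma2):
    """Limits claimed in parts (a) and (b) of Lemma A6."""
    kappa_lam = lam * kappa_a + (1.0 - lam) * kappa_h
    lambda_sq = lam * kappa_a**2 + (1.0 - lam) * kappa_h**2   # Lambda_lambda^2
    return kappa_lam / sigma2, -(lambda_sq - kappa_lam**2) / (2.0 * sigma2**2)


if __name__ == "__main__":
    print(scaled_quantities(0.5, 1.0, 2.0, 2.0, 10**5))
    print(limits(0.5, 1.0, 2.0, 2.0))
```

The use of `log1p` and `expm1` is a design choice to limit cancellation, since q - β is of order 1/m² while q and β are both close to 1.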

For the proof of (c), we rely on the inequalities

{\underset{̲}{x}}_{0}^{(m)} \leq x_{0}^{(m)} \leq {\bar{x}}_{0}^{(m)}

(

m \in N

), where

{\underset{̲}{x}}_{0}^{(m)}

and

{\bar{x}}_{0}^{(m)}

are the obvious notational adaptions of (124) and (126), respectively. Notice that

{\underset{̲}{x}}_{0}^{(m)}

and

{\bar{x}}_{0}^{(m)}

are solutions of the (again adapted) quadratic equations

{\underset{̲}{Q}}_{λ}^{(m)} (x) = x

resp.

{\bar{Q}}_{λ}^{(m)} (x) = x

(cf. (127) and (128)). These solutions clearly exist in the case

λ \in] 0, 1 [

. For sufficiently large approximation steps

m \in N

, these solutions also exist in the case

λ \in] {\tilde{λ}}_{-}, {\tilde{λ}}_{+} [\ [0, 1]

since (138) together with parts (a) and (b) imply

lim_{m \to \infty} {(m \cdot (1 - q_{λ}^{(m)}))}^{2} - 2 \cdot q_{λ}^{(m)} \cdot m^{2} \cdot a_{1}^{(m)} = σ^{- 4} \cdot [λ κ_{A}^{2} + (1 - λ) κ_{H}^{2}] > 0, for λ \in] {\tilde{λ}}_{-}, {\tilde{λ}}_{+} [\ [0, 1] .

To prove part (c), we show that the limits of

{\underset{̲}{x}}_{0}^{(m)}

and

{\bar{x}}_{0}^{(m)}

coincide. Assume first that

λ \in] 0, 1 [

. Using (a) and (b), we obtain together with the obvious limit

{lim}_{m \to \infty} q_{λ}^{(m)} = 1

\begin{matrix} lim_{m \to \infty} m \cdot {\bar{x}}_{0}^{(m)} & = & lim_{m \to \infty} {(q_{λ}^{(m)})}^{- 1} \cdot [m \cdot (1 - q_{λ}^{(m)}) - \sqrt{{(m \cdot (1 - q_{λ}^{(m)}))}^{2} - 2 \cdot q_{λ}^{(m)} \cdot m^{2} \cdot a_{1}^{(m)}}] \\ = & \frac{κ_{λ}}{σ^{2}} - \sqrt{{(\frac{κ_{λ}}{σ^{2}})}^{2} + \frac{Λ_{λ}^{2} - κ_{λ}^{2}}{σ^{4}}} = - \frac{Λ_{λ} - κ_{λ}}{σ^{2}} . \end{matrix}

(A72)
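The closed form on the right-hand side of (A72) lends itself to a direct numerical check. The following minimal sketch evaluates it for a finite m and compares it with -(Λ_λ - κ_λ)/σ². It assumes a_1^{(m)} = q_λ^{(m)} - β_λ^{(m)}, in line with part (b); function names and parameter values are hypothetical.

```python
import math


def m_times_xbar0(lam, kappa_a, kappa_h, sigma2, m):
    """Evaluate q^{-1} [ m(1-q) - sqrt( (m(1-q))^2 - 2 q m^2 a_1 ) ] from (A72)."""
    log_q = (lam * math.log1p(-kappa_a / (sigma2 * m))
             + (1.0 - lam) * math.log1p(-kappa_h / (sigma2 * m)))
    q = math.exp(log_q)
    kappa_lam = lam * kappa_a + (1.0 - lam) * kappa_h
    u = -m * math.expm1(log_q)                                    # m (1 - q)
    w = m * m * (math.expm1(log_q) + kappa_lam / (sigma2 * m))    # m^2 a_1 (assumed q - beta)
    return (u - math.sqrt(u * u - 2.0 * q * w)) / q


def limit_c(lam, kappa_a, kappa_h, sigma2):
    """Limit -(Lambda_lambda - kappa_lambda) / sigma^2 from part (c)."""
    kappa_lam = lam * kappa_a + (1.0 - lam) * kappa_h
    big_lambda = math.sqrt(lam * kappa_a**2 + (1.0 - lam) * kappa_h**2)
    return -(big_lambda - kappa_lam) / sigma2


if __name__ == "__main__":
    print(m_times_xbar0(0.5, 1.0, 2.0, 1.0, 10**5))
    print(limit_c(0.5, 1.0, 2.0, 1.0))
```

For λ in ]0, 1[ both numbers should be negative and close to each other, which matches the sign statement in part (c).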

Let

{\underset{̲}{\underset{̲}{x}}}_{0}^{(m)}

be the adapted version of the auxiliary fixed-point lower bound defined in (125). By incorporating

{lim}_{m \to \infty} β_{λ}^{(m)} = 1

we obtain with (a) and (b)

lim_{m \to \infty} {\underset{̲}{\underset{̲}{x}}}_{0}^{(m)} = lim_{m \to \infty} max \{- β_{λ}^{(m)}, \frac{q_{λ}^{(m)} - β_{λ}^{(m)}}{1 - q_{λ}^{(m)}}\} = lim_{m \to \infty} \frac{1}{m} \cdot \frac{m^{2} \cdot a_{1}^{(m)}}{m \cdot (1 - q_{λ}^{(m)})} = 0,

which implies

\begin{matrix} lim_{m \to \infty} m \cdot {\underset{̲}{x}}_{0}^{(m)} & = & lim_{m \to \infty} \frac{e^{- {\underset{̲}{\underset{̲}{x}}}_{0}^{(m)}}}{q_{λ}^{(m)}} \cdot [m \cdot (1 - q_{λ}^{(m)}) - \sqrt{{(m \cdot (1 - q_{λ}^{(m)}))}^{2} - 2 \cdot e^{{\underset{̲}{\underset{̲}{x}}}_{0}^{(m)}} q_{λ}^{(m)} \cdot m^{2} \cdot a_{1}^{(m)}}] \\ = & \frac{κ_{λ}}{σ^{2}} - \sqrt{{(\frac{κ_{λ}}{σ^{2}})}^{2} + \frac{Λ_{λ}^{2} - κ_{λ}^{2}}{σ^{4}}} = - \frac{Λ_{λ} - κ_{λ}}{σ^{2}} . \end{matrix}

(A73)

Combining (A72) and (A73), the desired result (c) follows for

λ \in] 0, 1 [

. Assume now that

λ \in] {\tilde{λ}}_{-}, {\tilde{λ}}_{+} [\ [0, 1]

. In this case the approximates

{\underset{̲}{x}}_{0}^{(m)}

and

{\bar{x}}_{0}^{(m)}

have a different form, given in (124) and (126). However, the calculations work out in the same way: with parts (a) and (b) we get

\begin{matrix} lim_{m \to \infty} m \cdot {\underset{̲}{x}}_{0}^{(m)} & = & lim_{m \to \infty} \frac{1}{q_{λ}^{(m)}} \cdot [m \cdot (1 - q_{λ}^{(m)}) - \sqrt{{(m \cdot (1 - q_{λ}^{(m)}))}^{2} - 2 \cdot q_{λ}^{(m)} \cdot m^{2} \cdot a_{1}^{(m)}}] \\ = & \frac{κ_{λ}}{σ^{2}} - \sqrt{{(\frac{κ_{λ}}{σ^{2}})}^{2} + \frac{Λ_{λ}^{2} - κ_{λ}^{2}}{σ^{4}}} = - \frac{Λ_{λ} - κ_{λ}}{σ^{2}}, \end{matrix}

as well as

\begin{matrix} lim_{m \to \infty} m \cdot {\bar{x}}_{0}^{(m)} & = & lim_{m \to \infty} m \cdot (1 - q_{λ}^{(m)}) - \sqrt{{(m \cdot (1 - q_{λ}^{(m)}))}^{2} - 2 \cdot m^{2} \cdot a_{1}^{(m)}} \\ = & \frac{κ_{λ}}{σ^{2}} - \sqrt{{(\frac{κ_{λ}}{σ^{2}})}^{2} + \frac{Λ_{λ}^{2} - κ_{λ}^{2}}{σ^{4}}} = - \frac{Λ_{λ} - κ_{λ}}{σ^{2}}, \end{matrix}

which finally finishes the proof of part (c). Assertion (d) is a direct consequence of (c). Since the representations of the parameters

c^{(m), S},

d^{(m), S},

c^{(m), T},

d^{(m), T}

are the same in both cases

λ \in] 0, 1 [

and

λ \in] {\tilde{λ}}_{-}, {\tilde{λ}}_{+} [\ [0, 1]

, the following considerations hold generally. Part (e) follows from (b) and (c) by

lim_{m \to \infty} m \cdot (1 - d^{(m), S}) = lim_{m \to \infty} \frac{m^{2} \cdot a_{1}^{(m)}}{m \cdot x_{0}^{(m)}} = \frac{Λ_{λ} + κ_{λ}}{2 σ^{2}} > 0 .

Notice that this term is positive since on

] {\tilde{λ}}_{-}, {\tilde{λ}}_{+} [\ {0, 1}

there holds

κ_{λ} > 0

as well as

Λ_{λ} > 0

, cf. (A70). To prove (f), we apply the general limit

{lim}_{x \to 0} \frac{e^{x} - 1}{x} = 1

and get with (a), (c)

lim_{m \to \infty} m \cdot (1 - d^{(m), T}) = lim_{m \to \infty} (m \cdot (1 - q_{λ}^{(m)}) - q_{λ}^{(m)} \cdot m \cdot x_{0}^{(m)} \cdot \frac{e^{x_{0}^{(m)}} - 1}{x_{0}^{(m)}}) = \frac{Λ_{λ}}{σ^{2}} .

The limit (g) can be obtained from (e) and (f):

lim_{m \to \infty} m \cdot (1 - d^{(m), S} d^{(m), T}) = lim_{m \to \infty} \{m \cdot (1 - d^{(m), S}) + d^{(m), S} \cdot m \cdot (1 - d^{(m), T})\} = \frac{3 Λ_{λ} + κ_{λ}}{2 σ^{2}} .

The assertions (h) resp. (i) resp. (j) follow from (e) resp. (f) resp. (g) by using the general relation

{lim}_{m \to \infty} {(1 + \frac{x_{m}}{m})}^{m} = exp \{{lim}_{m \to \infty} x_{m}\}

. To get the last two parts (k) and (l), we repeatedly make use of the results (a) to (j) and combine them with the formulas (142) to (149) of Corollary 14. In more detail, for

λ \in] 0, 1 [

(and thus

q_{λ}^{(m)} < β_{λ}^{(m)}

) we obtain

\begin{matrix} m \cdot {\underset{̲}{ζ}}_{⌊σ^{2} m t⌋}^{(m)} & = & m^{2} \cdot Γ_{<}^{(m)} \cdot \frac{{(d^{(m), T})}^{⌊σ^{2} m t⌋ - 1}}{m \cdot (1 - d^{(m), T})} \cdot (1 - {(d^{(m), T})}^{⌊σ^{2} m t⌋}) \\ \overset{m \to \infty}{⟶} & \frac{{(Λ_{λ} - κ_{λ})}^{2}}{2 σ^{2} \cdot Λ_{λ}} \cdot e^{- Λ_{λ} \cdot t} \cdot (1 - e^{- Λ_{λ} \cdot t}) > 0, \\ {\underset{̲}{ϑ}}_{⌊σ^{2} m t⌋}^{(m)} & = & m^{2} \cdot Γ_{<}^{(m)} \cdot \frac{1 - {(d^{(m), T})}^{⌊σ^{2} m t⌋}}{{(m \cdot (1 - d^{(m), T}))}^{2}} \cdot [1 - \frac{d^{(m), T} (1 + {(d^{(m), T})}^{⌊σ^{2} m t⌋})}{1 + d^{(m), T}}] \\ \overset{m \to \infty}{⟶} & \frac{1}{4} \cdot {(\frac{Λ_{λ} - κ_{λ}}{Λ_{λ}})}^{2} \cdot {(1 - e^{- Λ_{λ} \cdot t})}^{2} > 0, \end{matrix}

\begin{matrix} m \cdot {\bar{ζ}}_{⌊σ^{2} m t⌋}^{(m)} & = & m^{2} \cdot Γ_{<}^{(m)} \cdot [\frac{{(d^{(m), S})}^{⌊σ^{2} m t⌋} - {(d^{(m), T})}^{⌊σ^{2} m t⌋}}{m \cdot (1 - d^{(m), T}) - m \cdot (1 - d^{(m), S})} \\ - {(d^{(m), S})}^{⌊σ^{2} m t⌋ - 1} \cdot \frac{1 - {(d^{(m), T})}^{⌊σ^{2} m t⌋}}{m \cdot (1 - d^{(m), T})}] \\ \overset{m \to \infty}{⟶} & \frac{{(Λ_{λ} - κ_{λ})}^{2}}{σ^{2}} \cdot [\frac{e^{- \frac{1}{2} (Λ_{λ} + κ_{λ}) \cdot t} - e^{- Λ_{λ} \cdot t}}{Λ_{λ} - κ_{λ}} - \frac{e^{- \frac{1}{2} (Λ_{λ} + κ_{λ}) \cdot t} (1 - e^{- Λ_{λ} \cdot t})}{2 \cdot Λ_{λ}}] > 0, \end{matrix}

\begin{matrix} {\bar{ϑ}}_{⌊σ^{2} m t⌋}^{(m)} & = & \frac{m^{2} \cdot Γ_{<}^{(m)} \cdot d^{(m), T}}{m \cdot (1 - d^{(m), T})} \cdot [\frac{1 - {(d^{(m), S} d^{(m), T})}^{⌊σ^{2} m t⌋}}{m \cdot (1 - d^{(m), S} d^{(m), T})} - \frac{{(d^{(m), S})}^{⌊σ^{2} m t⌋} - {(d^{(m), T})}^{⌊σ^{2} m t⌋}}{m \cdot (1 - d^{(m), T}) - m \cdot (1 - d^{(m), S})}] \\ \overset{m \to \infty}{⟶} & \frac{{(Λ_{λ} - κ_{λ})}^{2}}{Λ_{λ}} \cdot [\frac{1 - e^{- \frac{1}{2} (3 Λ_{λ} + κ_{λ}) \cdot t}}{3 Λ_{λ} + κ_{λ}} + \frac{e^{- Λ_{λ} \cdot t} - e^{- \frac{1}{2} (Λ_{λ} + κ_{λ}) \cdot t}}{Λ_{λ} - κ_{λ}}] > 0 . \end{matrix}

For

λ \in] {\tilde{λ}}_{-}, {\tilde{λ}}_{+} [\ [0, 1]

(and thus

q_{λ}^{(m)} > β_{λ}^{(m)}

) we get

\begin{matrix} m \cdot {\underset{̲}{ζ}}_{⌊σ^{2} m t⌋}^{(m)} & = & m^{2} \cdot Γ_{>}^{(m)} \cdot \frac{{(d^{(m), T})}^{⌊σ^{2} m t⌋} - {(d^{(m), S})}^{2 \cdot ⌊σ^{2} m t⌋}}{m \cdot (1 - d^{(m), S}) (1 + d^{(m), S}) - m \cdot (1 - d^{(m), T})} \\ \overset{m \to \infty}{⟶} & \frac{{(Λ_{λ} - κ_{λ})}^{2}}{2 σ^{2} \cdot κ_{λ}} \cdot e^{- Λ_{λ} \cdot t} \cdot (1 - e^{- κ_{λ} \cdot t}) > 0, \\ {\underset{̲}{ϑ}}_{⌊σ^{2} m t⌋}^{(m)} & = & \frac{m^{2} \cdot Γ_{>}^{(m)}}{m \cdot (1 - d^{(m), S}) (1 + d^{(m), S}) - m \cdot (1 - d^{(m), T})} \\ \cdot [\frac{d^{(m), T} \cdot (1 - {(d^{(m), T})}^{⌊σ^{2} m t⌋})}{m \cdot (1 - d^{(m), T})} - \frac{{(d^{(m), S})}^{2} \cdot (1 - {(d^{(m), S})}^{2 \cdot ⌊σ^{2} m t⌋})}{m \cdot (1 - d^{(m), S}) (1 + d^{(m), S})}] \\ \overset{m \to \infty}{⟶} & \frac{{(Λ_{λ} - κ_{λ})}^{2}}{2 \cdot κ_{λ}} \cdot [\frac{1 - e^{- Λ_{λ} \cdot t}}{Λ_{λ}} - \frac{1 - e^{- (Λ_{λ} + κ_{λ}) \cdot t}}{Λ_{λ} + κ_{λ}}] > 0, \\ m \cdot {\bar{ζ}}_{⌊σ^{2} m t⌋}^{(m)} & = & m^{2} \cdot Γ_{>}^{(m)} \cdot {(d^{(m), S})}^{⌊σ^{2} m t⌋ - 1} \cdot [\frac{⌊σ^{2} m t⌋}{m} - \frac{1 - {(d^{(m), T})}^{⌊σ^{2} m t⌋}}{m \cdot (1 - d^{(m), T})}] \\ \overset{m \to \infty}{⟶} & \frac{{(Λ_{λ} - κ_{λ})}^{2}}{2 \cdot σ^{2}} \cdot e^{- \frac{1}{2} (Λ_{λ} + κ_{λ}) \cdot t} \cdot [t - \frac{1 - e^{- Λ_{λ} \cdot t}}{Λ_{λ}}] > 0, \end{matrix}

\begin{matrix} {\bar{ϑ}}_{⌊σ^{2} m t⌋}^{(m)} & = & m^{2} \cdot Γ_{>}^{(m)} \cdot [\frac{m \cdot (1 - d^{(m), T}) - m \cdot (1 - d^{(m), S})}{m^{2} \cdot {(1 - d^{(m), S})}^{2} \cdot m \cdot (1 - d^{(m), T})} \cdot (1 - {(d^{(m), S})}^{⌊σ^{2} m t⌋}) \\ + \frac{d^{(m), T} (1 - {(d^{(m), S} d^{(m), T})}^{⌊σ^{2} m t⌋})}{m \cdot (1 - d^{(m), T}) \cdot m \cdot (1 - d^{(m), S} d^{(m), T})} - \frac{{(d^{(m), S})}^{⌊σ^{2} m t⌋}}{m \cdot (1 - d^{(m), S})} \cdot \frac{⌊σ^{2} m t⌋}{m}] \\ \overset{m \to \infty}{⟶} & {(Λ_{λ} - κ_{λ})}^{2} \cdot [\frac{(Λ_{λ} - κ_{λ}) (1 - e^{- \frac{1}{2} (Λ_{λ} + κ_{λ}) \cdot t})}{Λ_{λ} \cdot {(Λ_{λ} + κ_{λ})}^{2}} + \frac{1 - e^{- \frac{1}{2} (3 Λ_{λ} + κ_{λ}) \cdot t}}{Λ_{λ} \cdot (3 Λ_{λ} + κ_{λ})} - \frac{e^{- \frac{1}{2} (Λ_{λ} + κ_{λ}) \cdot t}}{Λ_{λ} + κ_{λ}} \cdot t] > 0 . □ \end{matrix}

Proof of Theorem 11.

It suffices to compute the limits of the bounds given in Corollary 14 as m tends to infinity. This is done by applying Lemma A6 which provides corresponding limits of all quantities of interest. Accordingly, for all

t > 0

the lower bound (153) in the case

λ \in] 0, 1 [

can be obtained from (140), (142) and (143) by

\begin{matrix} lim_{m \to \infty} exp {x_{0}^{(m)} \cdot [X_{0}^{(m)} - \frac{η}{σ^{2}} \cdot \frac{d^{(m), T}}{1 - d^{(m), T}}] (1 - {(d^{(m), T})}^{⌊σ^{2} m t⌋}) \\ + x_{0}^{(m)} \frac{η}{σ^{2}} \cdot ⌊σ^{2} m t⌋ + {\underset{̲}{ζ}}_{⌊σ^{2} m t⌋}^{(m)} \cdot X_{0}^{(m)} + {\underset{̲}{ϑ}}_{⌊σ^{2} m t⌋}^{(m)}} \\ = lim_{m \to \infty} exp {m \cdot x_{0}^{(m)} \cdot [\frac{X_{0}^{(m)}}{m} - \frac{η}{σ^{2}} \cdot \frac{d^{(m), T}}{m \cdot (1 - d^{(m), T})}] (1 - {(d^{(m), T})}^{⌊σ^{2} m t⌋}) \\ + m \cdot x_{0}^{(m)} \frac{η}{σ^{2}} \cdot \frac{⌊σ^{2} m t⌋}{m} + m \cdot {\underset{̲}{ζ}}_{⌊σ^{2} m t⌋}^{(m)} \cdot \frac{X_{0}^{(m)}}{m} + {\underset{̲}{ϑ}}_{⌊σ^{2} m t⌋}^{(m)}} \\ = exp {- \frac{Λ_{λ} - κ_{λ}}{σ^{2}} \cdot [{\tilde{X}}_{0} - \frac{η}{σ^{2}} \cdot \frac{σ^{2}}{Λ_{λ}}] (1 - e^{- Λ_{λ} t}) - \frac{Λ_{λ} - κ_{λ}}{σ^{2}} \cdot \frac{η}{σ^{2}} \cdot σ^{2} t \\ + \frac{{(Λ_{λ} - κ_{λ})}^{2}}{2 σ^{2} \cdot Λ_{λ}} \cdot e^{- Λ_{λ} \cdot t} \cdot (1 - e^{- Λ_{λ} \cdot t}) \cdot {\tilde{X}}_{0} + \frac{η}{4 σ^{2}} \cdot {(\frac{Λ_{λ} - κ_{λ}}{Λ_{λ}})}^{2} \cdot {(1 - e^{- Λ_{λ} \cdot t})}^{2}} \\ = exp \{- \frac{Λ_{λ} - κ_{λ}}{σ^{2}} [{\tilde{X}}_{0} - \frac{η}{Λ_{λ}}] (1 - e^{- Λ_{λ} \cdot t}) - \frac{η}{σ^{2}} (Λ_{λ} - κ_{λ}) \cdot t + L_{λ}^{(1)} (t) \cdot {\tilde{X}}_{0} + \frac{η}{σ^{2}} \cdot L_{λ}^{(2)} (t)\} . \end{matrix}

For all

t > 0

, the upper bound (154) in the case

λ \in] 0, 1 [

follows analogously from (141), (144), (145) by

\begin{matrix} lim_{m \to \infty} exp {x_{0}^{(m)} \cdot [X_{0}^{(m)} - \frac{η}{σ^{2}} \cdot \frac{d^{(m), S}}{1 - d^{(m), S}}] (1 - {(d^{(m), S})}^{⌊σ^{2} m t⌋}) \\ + x_{0}^{(m)} \frac{η}{σ^{2}} \cdot ⌊σ^{2} m t⌋ - {\bar{ζ}}_{⌊σ^{2} m t⌋}^{(m)} \cdot X_{0}^{(m)} - {\bar{ϑ}}_{⌊σ^{2} m t⌋}^{(m)}} \\ = lim_{m \to \infty} exp {m \cdot x_{0}^{(m)} \cdot [\frac{X_{0}^{(m)}}{m} - \frac{η}{σ^{2}} \cdot \frac{d^{(m), S}}{m \cdot (1 - d^{(m), S})}] (1 - {(d^{(m), S})}^{⌊σ^{2} m t⌋}) \\ + m \cdot x_{0}^{(m)} \frac{η}{σ^{2}} \cdot \frac{⌊σ^{2} m t⌋}{m} - m \cdot {\bar{ζ}}_{⌊σ^{2} m t⌋}^{(m)} \cdot \frac{X_{0}^{(m)}}{m} - {\bar{ϑ}}_{⌊σ^{2} m t⌋}^{(m)}} \\ = exp {- \frac{Λ_{λ} - κ_{λ}}{σ^{2}} [{\tilde{X}}_{0} - \frac{η}{σ^{2}} \cdot \frac{2 σ^{2}}{Λ_{λ} + κ_{λ}}] (1 - (e^{- \frac{1}{2} (Λ_{λ} + κ_{λ}) t})) - \frac{Λ_{λ} - κ_{λ}}{σ^{2}} \cdot \frac{η}{σ^{2}} \cdot σ^{2} t \\ - \frac{{(Λ_{λ} - κ_{λ})}^{2}}{σ^{2}} \cdot [\frac{e^{- \frac{1}{2} (Λ_{λ} + κ_{λ}) \cdot t} - e^{- Λ_{λ} \cdot t}}{Λ_{λ} - κ_{λ}} - \frac{e^{- \frac{1}{2} (Λ_{λ} + κ_{λ}) \cdot t} (1 - e^{- Λ_{λ} \cdot t})}{2 \cdot Λ_{λ}}] \cdot {\tilde{X}}_{0} \\ - \frac{η}{σ^{2}} \frac{{(Λ_{λ} - κ_{λ})}^{2}}{Λ_{λ}} \cdot [\frac{1 - e^{- \frac{1}{2} (3 Λ_{λ} + κ_{λ}) \cdot t}}{3 Λ_{λ} + κ_{λ}} + \frac{e^{- Λ_{λ} \cdot t} - e^{- \frac{1}{2} (Λ_{λ} + κ_{λ}) \cdot t}}{Λ_{λ} - κ_{λ}}]} \\ = exp {- \frac{Λ_{λ} - κ_{λ}}{σ^{2}} [{\tilde{X}}_{0} - \frac{η}{\frac{1}{2} (Λ_{λ} + κ_{λ})}] (1 - e^{- \frac{1}{2} (Λ_{λ} + κ_{λ}) \cdot t}) - \frac{η}{σ^{2}} (Λ_{λ} - κ_{λ}) \cdot t \\ - U_{λ}^{(1)} (t) \cdot {\tilde{X}}_{0} - \frac{η}{σ^{2}} \cdot U_{λ}^{(2)} (t)} . \end{matrix}
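For a concrete impression of how tight the two limit bounds are for λ in ]0, 1[, they can be evaluated numerically. The sketch below is an illustration only: the expressions for L^{(1)}, L^{(2)}, U^{(1)}, U^{(2)} are read off from the last two lines of each of the two preceding displays, the function name and the parameter values are hypothetical, and κ_A ≠ κ_H is assumed so that Λ_λ - κ_λ > 0.

```python
import math


def hellinger_limit_bounds(lam, kappa_a, kappa_h, eta, sigma2, x0, t):
    """Lower and upper limit bounds for lam in ]0, 1[ and kappa_a != kappa_h."""
    kappa = lam * kappa_a + (1.0 - lam) * kappa_h
    big_l = math.sqrt(lam * kappa_a**2 + (1.0 - lam) * kappa_h**2)   # Lambda_lambda
    diff = big_l - kappa
    half = 0.5 * (big_l + kappa)
    e_t = math.exp(-big_l * t)
    e_s = math.exp(-half * t)
    e_st = math.exp(-0.5 * (3.0 * big_l + kappa) * t)

    l1 = diff**2 / (2.0 * sigma2 * big_l) * e_t * (1.0 - e_t)
    l2 = 0.25 * (diff / big_l) ** 2 * (1.0 - e_t) ** 2
    u1 = diff**2 / sigma2 * ((e_s - e_t) / diff - e_s * (1.0 - e_t) / (2.0 * big_l))
    u2 = diff**2 / big_l * ((1.0 - e_st) / (3.0 * big_l + kappa) + (e_t - e_s) / diff)

    drift = -(eta / sigma2) * diff * t    # term linear in t, common to both bounds
    lower = math.exp(-diff / sigma2 * (x0 - eta / big_l) * (1.0 - e_t)
                     + drift + l1 * x0 + eta / sigma2 * l2)
    upper = math.exp(-diff / sigma2 * (x0 - eta / half) * (1.0 - e_s)
                     + drift - u1 * x0 - eta / sigma2 * u2)
    return lower, upper


if __name__ == "__main__":
    print(hellinger_limit_bounds(0.5, 1.0, 2.0, 1.0, 1.0, 1.0, 1.0))
```

Here `x0` plays the role of the initial value of the limiting diffusion. For the assumed parameters the two bounds should be close to each other and below 1.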

In the case

λ \in] {\tilde{λ}}_{-}, {\tilde{λ}}_{+} [\ [0, 1]

, the lower bound as well as the upper bound of the Hellinger integral limit is obtained analogously, by taking into account that the quantities

{\underset{̲}{ζ}}_{n}^{(m)}, {\underset{̲}{ϑ}}_{n}^{(m)}, {\bar{ζ}}_{n}^{(m)}, {\bar{ϑ}}_{n}^{(m)}

now have the form (146) to (149) instead of (142) to (145). Thus, the functions

L_{λ}^{(1)} (t), U_{λ}^{(1)} (t), L_{λ}^{(2)} (t), U_{λ}^{(2)} (t)

are obtained by employing the limits of part (l) of Lemma A6 instead of part (k). □

The next lemma (and parts of its proof) will be useful for the verification of Theorem 12:

Lemma A7.

Recall the bounds on the Hellinger integral

m -

limit given in (153) and (154) of Theorem 11, in terms of

L_{λ}^{(i)} (t)

and

U_{λ}^{(i)} (t)

(

i = 1, 2

) defined by (155) to (158). Correspondingly, one gets the following

λ -

limits for all

t \in [0, \infty [

:

(a): for all $κ_{A} \in] 0, \infty [$ and all $κ_{H} \in [0, \infty [$ with $κ_{A} \neq κ_{H}$

$lim_{λ ↗ 1} \frac{\partial L_{λ}^{(1)} (t)}{\partial λ} = lim_{λ ↗ 1} \frac{\partial L_{λ}^{(2)} (t)}{\partial λ} = lim_{λ ↗ 1} \frac{\partial U_{λ}^{(1)} (t)}{\partial λ} = lim_{λ ↗ 1} \frac{\partial U_{λ}^{(2)} (t)}{\partial λ} = 0 .$

(A74)

(b): for $κ_{A} = 0$ and all $κ_{H} \in] 0, \infty [$

$\begin{matrix} lim_{λ ↗ 1} \frac{\partial L_{λ}^{(1)} (t)}{\partial λ} = - \frac{κ_{H}^{2} \cdot t}{2 σ^{2}}, \end{matrix}$

(A75)

$\begin{matrix} lim_{λ ↗ 1} \frac{\partial L_{λ}^{(2)} (t)}{\partial λ} = - \frac{κ_{H}^{2} \cdot t^{2}}{4}, \end{matrix}$

(A76)

$\begin{matrix} lim_{λ ↗ 1} \frac{\partial U_{λ}^{(1)} (t)}{\partial λ} = lim_{λ ↗ 1} \frac{\partial U_{λ}^{(2)} (t)}{\partial λ} = 0 . \end{matrix}$

(A77)
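The two non-zero limits (A75) and (A76) in part (b) can be checked by finite differences. This is a minimal sketch, assuming the expressions L^{(1)}_λ(t) = (Λ_λ - κ_λ)² / (2σ²Λ_λ) · e^{-Λ_λ t} (1 - e^{-Λ_λ t}) and L^{(2)}_λ(t) = ¼ ((Λ_λ - κ_λ)/Λ_λ)² (1 - e^{-Λ_λ t})², as they appear in the limits of Lemma A6 (k), specialised to κ_A = 0. The parametrisation by ε = 1 - λ and all function names are hypothetical.

```python
import math


def l1(eps, kappa_h, sigma2, t):
    """L^(1)_lambda(t) at lambda = 1 - eps for kappa_A = 0."""
    big_l = kappa_h * math.sqrt(eps)     # Lambda_lambda = kappa_H sqrt(1 - lambda)
    kappa = kappa_h * eps                # kappa_lambda = kappa_H (1 - lambda)
    one_minus_e = -math.expm1(-big_l * t)
    return (big_l - kappa) ** 2 / (2.0 * sigma2 * big_l) * math.exp(-big_l * t) * one_minus_e


def l2(eps, kappa_h, t):
    """L^(2)_lambda(t) at lambda = 1 - eps for kappa_A = 0."""
    big_l = kappa_h * math.sqrt(eps)
    kappa = kappa_h * eps
    one_minus_e = -math.expm1(-big_l * t)
    return 0.25 * ((big_l - kappa) / big_l) ** 2 * one_minus_e**2


def d_dlambda(func, eps, h):
    """Central difference for d/dlambda at lambda = 1 - eps (note d/dlambda = -d/deps)."""
    return -(func(eps + h) - func(eps - h)) / (2.0 * h)


if __name__ == "__main__":
    kappa_h, sigma2, t = 2.0, 1.5, 0.5
    print(d_dlambda(lambda e: l1(e, kappa_h, sigma2, t), 1e-8, 1e-9),
          -kappa_h**2 * t / (2.0 * sigma2))
    print(d_dlambda(lambda e: l2(e, kappa_h, t), 1e-8, 1e-9),
          -kappa_h**2 * t**2 / 4.0)
```

The agreement is only approximate for fixed ε, because the derivatives approach their limits at rate √(1 - λ).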

Proof of Lemma A7.

For all

κ_{A}, κ_{H} \in [0, \infty [

with

κ_{A} \neq κ_{H}

one can deduce from (150) as well as (155) to (158) the following derivatives:

\begin{matrix} \frac{\partial L_{λ}^{(1)} (t)}{\partial λ} = \frac{1}{2 σ^{2}} {\frac{t}{2} {(\frac{Λ_{λ} - κ_{λ}}{Λ_{λ}})}^{2} (κ_{A}^{2} - κ_{H}^{2}) [2 e^{- 2 Λ_{λ} t} - e^{- Λ_{λ} t}] \\ + e^{- Λ_{λ} t} \frac{1 - e^{- Λ_{λ} t}}{Λ_{λ}} [\frac{Λ_{λ} - κ_{λ}}{Λ_{λ}} (κ_{A}^{2} - κ_{H}^{2} - 2 Λ_{λ} (κ_{A} - κ_{H})) - {(\frac{Λ_{λ} - κ_{λ}}{Λ_{λ}})}^{2} \frac{κ_{A}^{2} - κ_{H}^{2}}{2}]}, \end{matrix}

(A78)

\begin{matrix} \frac{\partial L_{λ}^{(2)} (t)}{\partial λ} = \frac{1}{4} {\frac{Λ_{λ} - κ_{λ}}{Λ_{λ}} \cdot {(\frac{1 - e^{- Λ_{λ} t}}{Λ_{λ}})}^{2} \cdot (κ_{A}^{2} - κ_{H}^{2} - 2 Λ_{λ} (κ_{A} - κ_{H}) - \frac{Λ_{λ} - κ_{λ}}{Λ_{λ}} (κ_{A}^{2} - κ_{H}^{2})) \\ + t \cdot e^{- Λ_{λ} t} \cdot {(\frac{Λ_{λ} - κ_{λ}}{Λ_{λ}})}^{2} \cdot \frac{1 - e^{- Λ_{λ} t}}{Λ_{λ}} \cdot (κ_{A}^{2} - κ_{H}^{2})}, \end{matrix}

(A79)

\begin{matrix} \frac{\partial U_{λ}^{(1)} (t)}{\partial λ} = \frac{1}{σ^{2}} {\frac{Λ_{λ} - κ_{λ}}{2 Λ_{λ}} [t e^{- Λ_{λ} t} (κ_{A}^{2} - κ_{H}^{2}) - \frac{t}{2} e^{- \frac{1}{2} (Λ_{λ} + κ_{λ}) t} (κ_{A}^{2} - κ_{H}^{2} + 2 Λ_{λ} (κ_{A} - κ_{H}))] \\ - \frac{e^{- \frac{1}{2} (Λ_{λ} + κ_{λ}) t} - e^{- Λ_{λ} t}}{2 Λ_{λ}} \cdot (κ_{A}^{2} - κ_{H}^{2} - 2 Λ_{λ} (κ_{A} - κ_{H})) \\ + {(\frac{Λ_{λ} - κ_{λ}}{2 Λ_{λ}})}^{2} [\frac{t}{2} e^{- \frac{1}{2} (Λ_{λ} + κ_{λ}) t} (κ_{A}^{2} - κ_{H}^{2} + 2 Λ_{λ} (κ_{A} - κ_{H})) \\ - \frac{t}{2} e^{- \frac{1}{2} (3 Λ_{λ} + κ_{λ}) t} (3 (κ_{A}^{2} - κ_{H}^{2}) + 2 Λ_{λ} (κ_{A} - κ_{H})) \\ + e^{- \frac{1}{2} (Λ_{λ} + κ_{λ}) t} \cdot \frac{1 - e^{- Λ_{λ} t}}{Λ_{λ}} \cdot (κ_{A}^{2} - κ_{H}^{2})] \\ + \frac{Λ_{λ} - κ_{λ}}{Λ_{λ}} (κ_{A}^{2} - κ_{H}^{2} - 2 Λ_{λ} (κ_{A} - κ_{H})) [\frac{e^{- \frac{1}{2} (Λ_{λ} + κ_{λ}) t} - e^{- Λ_{λ} t}}{Λ_{λ} - κ_{λ}} - \frac{e^{- \frac{1}{2} (Λ_{λ} + κ_{λ}) t} (1 - e^{- Λ_{λ} t})}{2 Λ_{λ}}]}, \end{matrix}

(A80)

\begin{matrix} \frac{\partial U_{λ}^{(2)} (t)}{\partial λ} & = & \frac{{(Λ_{λ} - κ_{λ})}^{2}}{Λ_{λ} (3 Λ_{λ} + κ_{λ})} [\frac{t}{2} e^{- \frac{1}{2} (3 Λ_{λ} + κ_{λ}) t} (3 \frac{κ_{A}^{2} - κ_{H}^{2}}{2 Λ_{λ}} + κ_{A} - κ_{H}) \\ - \frac{1 - e^{- \frac{1}{2} (3 Λ_{λ} + κ_{λ}) t}}{3 Λ_{λ} + κ_{λ}} \cdot (3 \frac{κ_{A}^{2} - κ_{H}^{2}}{2 Λ_{λ}} + κ_{A} - κ_{H})] \\ + \frac{Λ_{λ} - κ_{λ}}{Λ_{λ}} [\frac{t}{2} e^{- \frac{1}{2} (Λ_{λ} + κ_{λ}) t} (\frac{κ_{A}^{2} - κ_{H}^{2}}{2 Λ_{λ}} + κ_{A} - κ_{H}) - t e^{- Λ_{λ} t} \frac{κ_{A}^{2} - κ_{H}^{2}}{2 Λ_{λ}}] \\ + \frac{e^{- \frac{1}{2} (Λ_{λ} + κ_{λ}) t} - e^{- Λ_{λ} t}}{Λ_{λ}} (\frac{κ_{A}^{2} - κ_{H}^{2}}{2 Λ_{λ}} - κ_{A} + κ_{H}) \\ + [2 (\frac{κ_{A}^{2} - κ_{H}^{2}}{2 Λ_{λ}} - κ_{A} + κ_{H}) - \frac{Λ_{λ} - κ_{λ}}{Λ_{λ}^{2}} \cdot \frac{κ_{A}^{2} - κ_{H}^{2}}{2}] \\ \cdot \frac{1}{Λ_{λ}} [\frac{Λ_{λ} - κ_{λ}}{3 Λ_{λ} + κ_{λ}} (1 - e^{- \frac{1}{2} (3 Λ_{λ} + κ_{λ}) t}) - e^{- \frac{1}{2} (Λ_{λ} + κ_{λ}) t} + e^{- Λ_{λ} t}] . \end{matrix}

(A81)

If

κ_{A} \in] 0, \infty [

and

κ_{H} \in [0, \infty [

with

κ_{A} \neq κ_{H}

, then one gets

{lim}_{λ ↗ 1} Λ_{λ} = {lim}_{λ ↗ 1} κ_{λ} = κ_{A} > 0

which implies (A74) from (A78) to (A81). For the proof of part (b), let us correspondingly assume

κ_{A} = 0

and

κ_{H} \in] 0, \infty [

, which by (150) leads to

κ_{λ} = κ_{H} \cdot (1 - λ)

,

Λ_{λ} = κ_{H} \cdot \sqrt{1 - λ}

and the convergences

{lim}_{λ ↗ 1} Λ_{λ} = {lim}_{λ ↗ 1} κ_{λ} = 0

. From this, the assertions (A75), (A76), (A77) follow in a straightforward manner from (A78), (A79), (A80) – respectively – by using (parts of) the obvious relations

lim_{λ ↗ 1} \frac{κ_{λ}}{Λ_{λ}} = 0, lim_{λ ↗ 1} \frac{Λ_{λ} \pm κ_{λ}}{Λ_{λ}} = lim_{λ ↗ 1} \frac{Λ_{λ} - κ_{λ}}{Λ_{λ} + κ_{λ}} = 1,

(A82)

lim_{λ ↗ 1} \frac{1 - e^{- c_{λ} \cdot t}}{c_{λ}} = t for all c_{λ} \in \{Λ_{λ}, \frac{Λ_{λ} + κ_{λ}}{2}, \frac{3 Λ_{λ} + κ_{λ}}{2}\} .

(A83)

In order to get the last assertion in (A77), we make use of the following limits

lim_{λ ↗ 1} \frac{1}{Λ_{λ} - κ_{λ}} - \frac{3}{3 Λ_{λ} + κ_{λ}} = lim_{λ ↗ 1} \frac{4 κ_{H}}{(κ_{H} - κ_{H} \cdot \sqrt{1 - λ}) \cdot (3 κ_{H} + κ_{H} \cdot \sqrt{1 - λ})} = \frac{4}{3 κ_{H}}

(A84)

and

lim_{λ ↗ 1} \frac{1}{Λ_{λ}} [\frac{1 - e^{- \frac{1}{2} (3 Λ_{λ} + κ_{λ}) t}}{3 Λ_{λ} + κ_{λ}} - \frac{1 - e^{- Λ_{λ} t}}{Λ_{λ} - κ_{λ}} + \frac{1 - e^{- \frac{1}{2} (Λ_{λ} + κ_{λ}) t}}{Λ_{λ} - κ_{λ}}] = 0 .

(A85)

To see (A85), let us first observe that the involved limit can be rewritten as

\begin{matrix} lim_{λ ↗ 1} \{\frac{1}{Λ_{λ} (Λ_{λ} - κ_{λ})} [\frac{1}{3} - \frac{1}{3} e^{- \frac{1}{2} (3 Λ_{λ} + κ_{λ}) t} + e^{- Λ_{λ} t} - e^{- \frac{1}{2} (Λ_{λ} + κ_{λ}) t}] \end{matrix}

(A86)

\begin{matrix} + \frac{1 - e^{- \frac{1}{2} (3 Λ_{λ} + κ_{λ}) t}}{Λ_{λ}} [\frac{1}{3 Λ_{λ} + κ_{λ}} - \frac{1}{3 (Λ_{λ} - κ_{λ})}]\} . \end{matrix}

(A87)

Substituting

x : = \sqrt{1 - λ}

and applying l’Hospital’s rule twice, we get for the first limit (A86)

\begin{matrix} lim_{x ↘ 0} \frac{\frac{1}{3} - \frac{1}{3} e^{- \frac{κ_{H} t}{2} (3 x + x^{2})} + e^{- κ_{H} t x} - e^{- \frac{κ_{H} t}{2} (x + x^{2})}}{κ_{H}^{2} \cdot (x^{2} - x^{3})} \\ = lim_{x ↘ 0} \frac{\frac{κ_{H} t}{6} (3 + 2 x) e^{- \frac{κ_{H} t}{2} (3 x + x^{2})} - κ_{H} t e^{- κ_{H} t x} + \frac{κ_{H} t}{2} (1 + 2 x) e^{- \frac{κ_{H} t}{2} (x + x^{2})}}{κ_{H}^{2} \cdot (2 x - 3 x^{2})} \\ = lim_{x ↘ 0} \frac{[- \frac{κ_{H}^{2} t^{2}}{12} {(3 + 2 x)}^{2} + \frac{κ_{H} t}{3}] e^{- \frac{κ_{H} t}{2} (3 x + x^{2})} + κ_{H}^{2} t^{2} e^{- κ_{H} t x} - [\frac{κ_{H}^{2} t^{2}}{4} {(1 + 2 x)}^{2} - κ_{H} t] e^{- \frac{κ_{H} t}{2} (x + x^{2})}}{κ_{H}^{2} \cdot (2 - 6 x)} \\ = \frac{1}{2 κ_{H}^{2}} [- \frac{3 κ_{H}^{2} t^{2}}{4} + \frac{κ_{H} t}{3} + κ_{H}^{2} t^{2} - \frac{κ_{H}^{2} t^{2}}{4} + κ_{H} t] = \frac{2 t}{3 κ_{H}} . \end{matrix}

The second limit (A87) becomes

\begin{matrix} lim_{λ ↗ 1} \frac{1 - e^{- \frac{1}{2} (3 Λ_{λ} + κ_{λ}) t}}{3 Λ_{λ} + κ_{λ}} \cdot \frac{3 Λ_{λ} + κ_{λ}}{Λ_{λ}} \cdot \frac{- 4 κ_{H}}{(3 κ_{H} + \sqrt{1 - λ} κ_{H}) (3 κ_{H} - 3 \sqrt{1 - λ} κ_{H})} \end{matrix}

(A88)

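which by (A82) and (A83) equals

\frac{t}{2} \cdot 3 \cdot (- \frac{4}{9 κ_{H}}) = - \frac{2 t}{3 κ_{H}}

and thus cancels the first limit; consequently (A85) follows. To proceed with the proof of (A77), we rearrange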

\begin{matrix} lim_{λ ↗ 1} \frac{\partial U_{λ}^{(2)} (t)}{\partial λ} = lim_{λ ↗ 1} \{{(\frac{Λ_{λ} - κ_{λ}}{Λ_{λ}})}^{2} [\frac{Λ_{λ}}{3 Λ_{λ} + κ_{λ}} (\frac{t}{2} e^{- \frac{1}{2} (3 Λ_{λ} + κ_{λ}) t} (- \frac{3 κ_{H}^{2}}{2 Λ_{λ}} - κ_{H})) \\ - \frac{Λ_{λ}}{3 Λ_{λ} + κ_{λ}} \cdot \frac{1 - e^{- \frac{1}{2} (3 Λ_{λ} + κ_{λ}) t}}{3 Λ_{λ} + κ_{λ}} (- \frac{3 κ_{H}^{2}}{2 Λ_{λ}} - κ_{H}) + \frac{Λ_{λ}}{Λ_{λ} - κ_{λ}} \frac{e^{- \frac{1}{2} (Λ_{λ} + κ_{λ}) t} - e^{- Λ_{λ} t}}{Λ_{λ} - κ_{λ}} (- \frac{κ_{H}^{2}}{2 Λ_{λ}} + κ_{H}) \\ - \frac{Λ_{λ}}{Λ_{λ} - κ_{λ}} (- \frac{t}{2} e^{- \frac{1}{2} (Λ_{λ} + κ_{λ}) t} (- \frac{κ_{H}^{2}}{2 Λ_{λ}} - κ_{H}) - t e^{- Λ_{λ} t} \frac{κ_{H}^{2}}{2 Λ_{λ}})] \\ + [\frac{Λ_{λ} - κ_{λ}}{Λ_{λ}} (- κ_{H}^{2} + 2 Λ_{λ} κ_{H}) + {(\frac{Λ_{λ} - κ_{λ}}{Λ_{λ}})}^{2} \frac{κ_{H}^{2}}{2}] \cdot [\frac{1 - e^{- \frac{1}{2} (3 Λ_{λ} + κ_{λ}) t}}{Λ_{λ} (3 Λ_{λ} + κ_{λ})} - \frac{e^{- \frac{1}{2} (Λ_{λ} + κ_{λ}) t} - e^{- Λ_{λ} t}}{Λ_{λ} (Λ_{λ} - κ_{λ})}]\} \\ = lim_{λ ↗ 1} \{{(\frac{Λ_{λ} - κ_{λ}}{Λ_{λ}})}^{2} [\frac{κ_{H}^{2} t}{4} (- \frac{3 e^{- \frac{1}{2} (3 Λ_{λ} + κ_{λ}) t}}{3 Λ_{λ} + κ_{λ}} - \frac{e^{- \frac{1}{2} (Λ_{λ} + κ_{λ}) t}}{Λ_{λ} - κ_{λ}} + \frac{2 e^{- Λ_{λ} t}}{Λ_{λ} - κ_{λ}}) \end{matrix}

(A89)

\begin{matrix} + \frac{κ_{H}^{2}}{2} (\frac{3 (1 - e^{- \frac{1}{2} (3 Λ_{λ} + κ_{λ}) t})}{{(3 Λ_{λ} + κ_{λ})}^{2}} - \frac{1 - e^{- Λ_{λ} t}}{{(Λ_{λ} - κ_{λ})}^{2}} + \frac{1 - e^{- \frac{1}{2} (Λ_{λ} + κ_{λ}) t}}{{(Λ_{λ} - κ_{λ})}^{2}}) \end{matrix}

(A90)

\begin{matrix} + κ_{H} (- \frac{Λ_{λ}}{3 Λ_{λ} + κ_{λ}} \cdot \frac{t e^{- \frac{1}{2} (3 Λ_{λ} + κ_{λ}) t}}{2} + \frac{Λ_{λ}}{3 Λ_{λ} + κ_{λ}} \cdot \frac{1 - e^{- \frac{1}{2} (3 Λ_{λ} + κ_{λ}) t}}{3 Λ_{λ} + κ_{λ}} - \frac{Λ_{λ}}{Λ_{λ} - κ_{λ}} \cdot \frac{t e^{- \frac{1}{2} (Λ_{λ} + κ_{λ}) t}}{2} \\ + \frac{Λ_{λ}}{Λ_{λ} - κ_{λ}} \cdot \frac{1 - e^{- Λ_{λ} t}}{Λ_{λ} - κ_{λ}} - \frac{Λ_{λ}}{Λ_{λ} - κ_{λ}} \cdot \frac{1 - e^{- \frac{1}{2} (Λ_{λ} + κ_{λ}) t}}{Λ_{λ} - κ_{λ}})] \\ + [\frac{Λ_{λ} - κ_{λ}}{Λ_{λ}} (- κ_{H}^{2} + 2 Λ_{λ} κ_{H}) + {(\frac{Λ_{λ} - κ_{λ}}{Λ_{λ}})}^{2} \frac{κ_{H}^{2}}{2}] \cdot [\frac{1 - e^{- \frac{1}{2} (3 Λ_{λ} + κ_{λ}) t}}{Λ_{λ} (3 Λ_{λ} + κ_{λ})} - \frac{e^{- \frac{1}{2} (Λ_{λ} + κ_{λ}) t} - e^{- Λ_{λ} t}}{Λ_{λ} (Λ_{λ} - κ_{λ})}]\} . \end{matrix}

(A91)

By means of (A82) to (A84), the limit of the expression following the opening square bracket in (A89) becomes

lim_{λ ↗ 1} \{\frac{κ_{H}^{2} t}{4} [\frac{1 - e^{- \frac{1}{2} (Λ_{λ} + κ_{λ}) t}}{Λ_{λ} - κ_{λ}} - 2 \frac{1 - e^{- Λ_{λ} t}}{Λ_{λ} - κ_{λ}} + 3 \frac{1 - e^{- \frac{1}{2} (3 Λ_{λ} + κ_{λ}) t}}{3 Λ_{λ} + κ_{λ}} + \frac{1}{Λ_{λ} - κ_{λ}} - \frac{3}{3 Λ_{λ} + κ_{λ}}]\} = \frac{κ_{H} t}{3},

(A92)

and the limit of the expression in (A90) becomes with (A85)

\begin{matrix} lim_{λ ↗ 1} \{\frac{Λ_{λ}}{Λ_{λ} - κ_{λ}} \cdot \frac{κ_{H}^{2}}{2 Λ_{λ}} \cdot [\frac{1 - e^{- \frac{1}{2} (3 Λ_{λ} + κ_{λ}) t}}{3 Λ_{λ} + κ_{λ}} - \frac{1 - e^{- Λ_{λ} t}}{Λ_{λ} - κ_{λ}} + \frac{1 - e^{- \frac{1}{2} (Λ_{λ} + κ_{λ}) t}}{Λ_{λ} - κ_{λ}}] \\ - \frac{κ_{H}^{2}}{2} \cdot \frac{1 - e^{- \frac{1}{2} (3 Λ_{λ} + κ_{λ}) t}}{3 Λ_{λ} + κ_{λ}} \cdot [\frac{1}{Λ_{λ} - κ_{λ}} - \frac{3}{3 Λ_{λ} + κ_{λ}}]\} = - \frac{κ_{H} t}{3} . \end{matrix}

(A93)

By putting (A91)–(A93) together with (A85) we finally end up with

lim_{λ ↗ 1} \frac{\partial U_{λ}^{(2)} (t)}{\partial λ} = [\frac{κ_{H} t}{3} - \frac{κ_{H} t}{3}] + κ_{H} (- \frac{t}{6} + \frac{t}{6} - \frac{t}{2} + t - \frac{t}{2}) + [- κ_{H}^{2} + \frac{κ_{H}^{2}}{2}] \cdot 0 = 0,

which finishes the proof of Lemma A7. □
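The limits (A84) and (A85), together with the two partial limits obtained from (A86) and (A87), can also be checked numerically. The following Python sketch is only an illustration and not part of the proof. It assumes κ_A = 0, uses the substitution x = \sqrt{1 - λ} from above, and all function names as well as the sample values of κ_H and t are hypothetical choices.

```python
import math


def lam_kap(x, kappa_h):
    """Return (Lambda_lambda, kappa_lambda) for kappa_A = 0, x = sqrt(1 - lambda)."""
    return kappa_h * x, kappa_h * x * x


def a84(x, kappa_h):
    """Expression under the limit in (A84)."""
    big, kap = lam_kap(x, kappa_h)
    return 1.0 / (big - kap) - 3.0 / (3.0 * big + kap)


def a85(x, kappa_h, t):
    """Expression under the limit in (A85); expm1 avoids cancellation."""
    big, kap = lam_kap(x, kappa_h)
    a = 0.5 * (3.0 * big + kap) * t
    b = 0.5 * (big + kap) * t
    term1 = -math.expm1(-a) / (3.0 * big + kap)
    term2 = -math.expm1(-big * t) / (big - kap)
    term3 = -math.expm1(-b) / (big - kap)
    return (term1 - term2 + term3) / big


def a86(x, kappa_h, t):
    """First summand of the rewritten limit, labelled (A86)."""
    big, kap = lam_kap(x, kappa_h)
    a = 0.5 * (3.0 * big + kap) * t
    b = 0.5 * (big + kap) * t
    # 1/3 - e^{-a}/3 + e^{-Lambda t} - e^{-b}, written with expm1
    num = -math.expm1(-a) / 3.0 + math.expm1(-big * t) - math.expm1(-b)
    return num / (big * (big - kap))


def a87(x, kappa_h, t):
    """Second summand of the rewritten limit, labelled (A87)."""
    big, kap = lam_kap(x, kappa_h)
    a = 0.5 * (3.0 * big + kap) * t
    bracket = 1.0 / (3.0 * big + kap) - 1.0 / (3.0 * (big - kap))
    return -math.expm1(-a) / big * bracket


if __name__ == "__main__":
    kappa_h, t = 2.0, 1.5  # hypothetical sample values
    for x in (1e-2, 1e-3, 1e-4):
        print(x, a84(x, kappa_h), a85(x, kappa_h, t),
              a86(x, kappa_h, t), a87(x, kappa_h, t))
```

As x decreases, the four printed quantities should approach 4/(3 κ_H), 0, 2t/(3 κ_H) and -2t/(3 κ_H), respectively.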

Proof of Theorem 12.

Recall from (131) the approximative Poisson offspring-distribution parameter

β_{•}^{(m)} : = 1 - \frac{κ_{•}}{σ^{2} m}

and Poisson immigration-distribution parameter

α_{•}^{(m)} : = β_{•}^{(m)} \cdot \frac{η}{σ^{2}}

, which is a special case of

(β_{A}^{(m)}, β_{H}^{(m)}, α_{A}^{(m)}, α_{H}^{(m)}) \in P_{NI} \cup P_{SP, 1}

. Let us first calculate

{lim}_{m \to \infty} I (P_{A, ⌊σ^{2} m t⌋}^{(m)} ∥ P_{H, ⌊σ^{2} m t⌋}^{(m)})

by starting from Theorem 3(a). Correspondingly, we evaluate for all

κ_{A} \geq 0

,

κ_{H} \geq 0

with

κ_{A} \neq κ_{H}

by a twofold application of l’Hospital’s rule

\begin{matrix} lim_{m \to \infty} m^{2} \cdot [β_{A}^{(m)} \cdot (log (\frac{β_{A}^{(m)}}{β_{H}^{(m)}}) - 1) + β_{H}^{(m)}] = lim_{m \to \infty} \frac{- m}{2 σ^{2}} [κ_{A} log (\frac{β_{A}^{(m)}}{β_{H}^{(m)}}) + κ_{H} (1 - \frac{β_{A}^{(m)}}{β_{H}^{(m)}})] \\ = \frac{1}{2 σ^{4}} \cdot lim_{m \to \infty} \frac{β_{H}^{(m)} \cdot κ_{A} - β_{A}^{(m)} \cdot κ_{H}}{{(β_{H}^{(m)})}^{2}} \cdot (κ_{A} \cdot \frac{β_{H}^{(m)}}{β_{A}^{(m)}} - κ_{H}) = \frac{{(κ_{A} - κ_{H})}^{2}}{2 σ^{4}} . \end{matrix}

(A94)

Additionally there holds

lim_{m \to \infty} m \cdot (1 - β_{A}^{(m)}) = \frac{κ_{A}}{σ^{2}} and lim_{m \to \infty} {(β_{A}^{(m)})}^{⌊σ^{2} m t⌋} = lim_{m \to \infty} {[{(1 - \frac{κ_{A}}{σ^{2} m})}^{m}]}^{⌊σ^{2} m t⌋ / m} = e^{- κ_{A} \cdot t} .

(A95)
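As a plausibility check of (A94) and (A95), one may evaluate both sides for a large but finite m. The Python sketch below is an illustration under stated assumptions only: the function names and the sample values of κ_A, κ_H, σ^2 and t are hypothetical, and the offspring parameters are taken as β^{(m)} = 1 - κ/(σ^2 m) as recalled above.

```python
import math


def a94_lhs(kappa_a, kappa_h, sigma2, m):
    """m^2 * [beta_A * (log(beta_A / beta_H) - 1) + beta_H] for finite m."""
    a = kappa_a / (sigma2 * m)  # equals 1 - beta_A^(m)
    h = kappa_h / (sigma2 * m)  # equals 1 - beta_H^(m)
    log_ratio = math.log1p(-a) - math.log1p(-h)
    # beta_A * log_ratio - beta_A + beta_H, rearranged to limit cancellation
    return m**2 * ((1.0 - a) * log_ratio + (a - h))


def a94_rhs(kappa_a, kappa_h, sigma2):
    """Claimed limit (kappa_A - kappa_H)^2 / (2 sigma^4)."""
    return (kappa_a - kappa_h) ** 2 / (2.0 * sigma2**2)


def a95_power(kappa_a, sigma2, t, m):
    """(beta_A^(m)) ** floor(sigma^2 m t), computed via log1p."""
    n = math.floor(sigma2 * m * t)
    return math.exp(n * math.log1p(-kappa_a / (sigma2 * m)))


if __name__ == "__main__":
    ka, kh, s2, t = 1.0, 3.0, 2.0, 1.5  # hypothetical sample values
    for m in (10**2, 10**4, 10**6):
        print(m, a94_lhs(ka, kh, s2, m), a94_rhs(ka, kh, s2),
              a95_power(ka, s2, t, m), math.exp(-ka * t))
```

The finite-m values should move towards the stated limits as m grows; the sketch says nothing about the rate of convergence.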

For

κ_{A} > 0

, we apply the upper part of formula (69) as well as (A94) and (A95) to derive

\begin{matrix} lim_{m \to \infty} I (P_{A, ⌊σ^{2} m t⌋}^{(m)} ∥ P_{H, ⌊σ^{2} m t⌋}^{(m)}) = lim_{m \to \infty} [\frac{m^{2} \cdot [β_{A}^{(m)} \cdot (log (\frac{β_{A}^{(m)}}{β_{H}^{(m)}}) - 1) + β_{H}^{(m)}]}{m \cdot (1 - β_{A}^{(m)})} \\ \cdot [\frac{X_{0}^{(m)}}{m} - \frac{α_{A}^{(m)}}{m \cdot (1 - β_{A}^{(m)})}] \cdot (1 - {(β_{A}^{(m)})}^{⌊σ^{2} m t⌋}) \\ + \frac{α_{A}^{(m)}}{β_{A}^{(m)} \cdot m \cdot (1 - β_{A}^{(m)})} \cdot m^{2} \cdot [β_{A}^{(m)} \cdot (log (\frac{β_{A}^{(m)}}{β_{H}^{(m)}}) - 1) + β_{H}^{(m)}] \cdot \frac{⌊σ^{2} m t⌋}{m}] \\ = \frac{{(κ_{A} - κ_{H})}^{2}}{2 σ^{2} \cdot κ_{A}} \cdot [({\tilde{X}}_{0} - \frac{η}{κ_{A}}) \cdot (1 - e^{- κ_{A} \cdot t}) + η \cdot t] . \end{matrix}

For

κ_{A} = 0

(and thus

κ_{H} > 0

,

β_{A}^{(m)} \equiv 1

,

α_{A}^{(m)} \equiv η / σ^{2}

), we apply the lower part of formula (69) as well as (A94) and (A95) to obtain

\begin{matrix} lim_{m \to \infty} I (P_{A, ⌊σ^{2} m t⌋}^{(m)} ∥ P_{H, ⌊σ^{2} m t⌋}^{(m)}) = {lim_{m \to \infty} m^{2} \cdot [β_{H}^{(m)} - log β_{H}^{(m)} - 1] \\ \cdot [\frac{η}{2 σ^{2}} \cdot \frac{{(⌊σ^{2} m t⌋)}^{2}}{m^{2}} + (\frac{X_{0}^{(m)}}{m} + \frac{η}{2 σ^{2} \cdot m}) \cdot \frac{⌊σ^{2} m t⌋}{m}]} = \frac{κ_{H}^{2}}{2 σ^{2}} \cdot [\frac{η}{2} \cdot t^{2} + {\tilde{X}}_{0} \cdot t] . \end{matrix}
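The two limits just derived can be compared with the finite-m expressions standing under the limit signs. The following Python sketch is a rough numerical illustration, not a restatement of formula (69): the finite-m expression is read off from the two displays above after cancelling the powers of m, the initial population is assumed to be X_0^{(m)} = m \cdot {\tilde{X}}_0 (one simple way to obtain X_0^{(m)}/m \to {\tilde{X}}_0), and all names and sample values are hypothetical.

```python
import math


def kl_finite_m(kappa_a, kappa_h, eta, sigma2, x0_tilde, t, m):
    """Expression under the limit in the displays above, for finite m."""
    a = kappa_a / (sigma2 * m)          # 1 - beta_A^(m)
    h = kappa_h / (sigma2 * m)          # 1 - beta_H^(m)
    alpha_a = (1.0 - a) * eta / sigma2  # alpha_A^(m) = beta_A^(m) * eta / sigma^2
    n = math.floor(sigma2 * m * t)      # number of generations
    x0 = m * x0_tilde                   # assumption: X_0^(m) = m * X~_0
    if kappa_a > 0.0:
        # c = beta_A * (log(beta_A / beta_H) - 1) + beta_H, written with log1p
        c = (1.0 - a) * (math.log1p(-a) - math.log1p(-h)) + (a - h)
        decay = -math.expm1(n * math.log1p(-a))  # 1 - beta_A ** n
        first = (c / a) * (x0 - alpha_a / a) * decay
        second = alpha_a / ((1.0 - a) * a) * c * n
        return first + second
    # kappa_A = 0: beta_A = 1 and alpha_A = eta / sigma^2
    c = -h - math.log1p(-h)             # beta_H - log(beta_H) - 1
    return c * (eta / (2.0 * sigma2) * n**2 + (x0 + eta / (2.0 * sigma2)) * n)


def kl_limit(kappa_a, kappa_h, eta, sigma2, x0_tilde, t):
    """Limits stated at the end of the two displays above."""
    if kappa_a > 0.0:
        pref = (kappa_a - kappa_h) ** 2 / (2.0 * sigma2 * kappa_a)
        return pref * ((x0_tilde - eta / kappa_a) * -math.expm1(-kappa_a * t) + eta * t)
    return kappa_h**2 / (2.0 * sigma2) * (eta / 2.0 * t**2 + x0_tilde * t)


if __name__ == "__main__":
    kh, eta, s2, x0t, t = 3.0, 2.0, 2.0, 1.5, 1.5  # hypothetical sample values
    for ka in (1.0, 0.0):
        for m in (10**2, 10**3, 10**5):
            print(ka, m, kl_finite_m(ka, kh, eta, s2, x0t, t, m),
                  kl_limit(ka, kh, eta, s2, x0t, t))
```

For each choice of κ_A the finite-m values should approach the corresponding limit as m increases.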

Let us now calculate the “converse” double limit

lim_{λ ↗ 1} lim_{m \to \infty} I_{λ} (P_{A, ⌊σ^{2} m t⌋}^{(m)} ∥ P_{H, ⌊σ^{2} m t⌋}^{(m)}) = lim_{λ ↗ 1} lim_{m \to \infty} \frac{1 - H_{λ} (P_{A, ⌊σ^{2} m t⌋}^{(m)} ∥ P_{H, ⌊σ^{2} m t⌋}^{(m)})}{λ \cdot (1 - λ)} .

This will be achieved by evaluating for each

t > 0

the two limits

lim_{λ ↗ 1} \frac{1 - d_{λ, {\tilde{X}}_{0}, t}^{L}}{λ \cdot (1 - λ)} and lim_{λ ↗ 1} \frac{1 - d_{λ, {\tilde{X}}_{0}, t}^{U}}{λ \cdot (1 - λ)}

(A96)

which will turn out to coincide; the involved lower and upper bounds

d_{λ, {\tilde{X}}_{0}, t}^{L}

,

d_{λ, {\tilde{X}}_{0}, t}^{U}

defined by (153) and (154) satisfy

{lim}_{λ ↗ 1} d_{λ, {\tilde{X}}_{0}, t}^{L} = {lim}_{λ ↗ 1} d_{λ, {\tilde{X}}_{0}, t}^{U} = 1

as an easy consequence of the limits (cf. (150))

lim_{λ ↗ 1} Λ_{λ} = κ_{A} \geq 0 and lim_{λ ↗ 1} κ_{λ} = κ_{A} \geq 0,

(A97)

as well as the formulas (A82) and (A83) for the case

κ_{A} = 0

. Accordingly, we compute

\begin{matrix} lim_{λ ↗ 1} \frac{1 - d_{λ, {\tilde{X}}_{0}, t}^{L}}{λ \cdot (1 - λ)} = lim_{λ ↗ 1} \frac{- d_{λ, {\tilde{X}}_{0}, t}^{L}}{1 - 2 λ} \frac{\partial}{\partial λ} [- \frac{Λ_{λ} - κ_{λ}}{σ^{2}} \cdot [{\tilde{X}}_{0} - \frac{η}{Λ_{λ}}] \cdot (1 - e^{- Λ_{λ} \cdot t}) - \frac{η}{σ^{2}} \cdot (Λ_{λ} - κ_{λ}) \cdot t \\ + L_{λ}^{(1)} (t) \cdot {\tilde{X}}_{0} + \frac{η}{σ^{2}} \cdot L_{λ}^{(2)} (t)] \\ = lim_{λ ↗ 1} {- \frac{Λ_{λ} - κ_{λ}}{σ^{2}} [({\tilde{X}}_{0} - \frac{η}{Λ_{λ}}) \cdot t e^{- Λ_{λ} \cdot t} \cdot \frac{\partial Λ_{λ}}{\partial λ} + (1 - e^{- Λ_{λ} \cdot t}) \cdot \frac{η}{Λ_{λ}^{2}} \cdot \frac{\partial Λ_{λ}}{\partial λ}] \\ - \frac{1}{σ^{2}} \cdot \frac{\partial}{\partial λ} (Λ_{λ} - κ_{λ}) \cdot ({\tilde{X}}_{0} - \frac{η}{Λ_{λ}}) \cdot (1 - e^{- Λ_{λ} \cdot t}) - \frac{η t}{σ^{2}} \cdot \frac{\partial}{\partial λ} (Λ_{λ} - κ_{λ}) \\ + {\tilde{X}}_{0} \frac{\partial L_{λ}^{(1)} (t)}{\partial λ} + \frac{η}{σ^{2}} \frac{\partial L_{λ}^{(2)} (t)}{\partial λ}}, with \end{matrix}

(A98)

\frac{\partial Λ_{λ}}{\partial λ} = \frac{κ_{A}^{2} - κ_{H}^{2}}{2 Λ_{λ}} and \frac{\partial κ_{λ}}{\partial λ} = κ_{A} - κ_{H} .

(A99)
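The derivatives in (A99) can be cross-checked by finite differences. Since definition (150) is not reproduced in this part of the text, the Python sketch below assumes the forms κ_λ = λ κ_A + (1 - λ) κ_H and Λ_λ = \sqrt{λ κ_A^2 + (1 - λ) κ_H^2}; these are inferred here because they agree with (A99), with the κ_A = 0 expressions used in the proof of Lemma A7, and with the limits (A97). The function names and sample values are hypothetical as well.

```python
import math


def kappa_lam(lam, kappa_a, kappa_h):
    """Assumed form of kappa_lambda (inferred, see lead-in)."""
    return lam * kappa_a + (1.0 - lam) * kappa_h


def big_lambda(lam, kappa_a, kappa_h):
    """Assumed form of Lambda_lambda (inferred, see lead-in)."""
    return math.sqrt(lam * kappa_a**2 + (1.0 - lam) * kappa_h**2)


def central_diff(f, lam, step=1e-6):
    """Symmetric finite-difference approximation of f'(lam)."""
    return (f(lam + step) - f(lam - step)) / (2.0 * step)


if __name__ == "__main__":
    ka, kh, lam = 1.0, 3.0, 0.7  # hypothetical sample values
    d_big = central_diff(lambda s: big_lambda(s, ka, kh), lam)
    d_kap = central_diff(lambda s: kappa_lam(s, ka, kh), lam)
    # Compare with the closed forms in (A99)
    print(d_big, (ka**2 - kh**2) / (2.0 * big_lambda(lam, ka, kh)))
    print(d_kap, ka - kh)
```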

For the case

κ_{A} > 0

, one can combine this with (A97) and (A74) to end up with

lim_{λ ↗ 1} \frac{1 - d_{λ, {\tilde{X}}_{0}, t}^{L}}{λ \cdot (1 - λ)} = \frac{{(κ_{A} - κ_{H})}^{2}}{2 σ^{2} \cdot κ_{A}} \cdot [({\tilde{X}}_{0} - \frac{η}{κ_{A}}) \cdot (1 - e^{- κ_{A} \cdot t}) + η \cdot t] .

(A100)

For the case

κ_{A} = 0

, we continue the calculation (A98) by rearranging terms and by employing the formulas (A75), (A76), (A82) and (A83) as well as the obvious relation

\frac{1}{Λ_{λ}} - \frac{Λ_{λ} - κ_{λ}}{Λ_{λ}^{2}} = \frac{1}{κ_{H}}

and obtain

\begin{matrix} lim_{λ ↗ 1} \frac{1 - d_{λ, {\tilde{X}}_{0}, t}^{L}}{λ \cdot (1 - λ)} = lim_{λ ↗ 1} {\frac{κ_{H}^{2} \cdot {\tilde{X}}_{0}}{2 σ^{2}} [\frac{Λ_{λ} - κ_{λ}}{Λ_{λ}} \cdot t \cdot e^{- Λ_{λ} t} + \frac{1 - e^{- Λ_{λ} t}}{Λ_{λ}}] \\ + \frac{η \cdot κ_{H}^{2} \cdot t}{2 σ^{2}} [\frac{1}{Λ_{λ}} - \frac{Λ_{λ} - κ_{λ}}{Λ_{λ}^{2}} + \frac{Λ_{λ} - κ_{λ}}{Λ_{λ}} \cdot \frac{1 - e^{- Λ_{λ} t}}{Λ_{λ}}] - \frac{η \cdot κ_{H}^{2}}{2 σ^{2}} \cdot \frac{1 - e^{- Λ_{λ} t}}{Λ_{λ}} [\frac{1}{Λ_{λ}} - \frac{Λ_{λ} - κ_{λ}}{Λ_{λ}^{2}}] \\ - \frac{κ_{H} \cdot {\tilde{X}}_{0}}{σ^{2}} (1 - e^{- Λ_{λ} t}) + \frac{η \cdot κ_{H}}{σ^{2}} [\frac{1 - e^{- Λ_{λ} t}}{Λ_{λ}} - t] + \frac{\partial L_{λ}^{(1)} (t)}{\partial λ} \cdot {\tilde{X}}_{0} + \frac{η}{σ^{2}} \cdot \frac{\partial L_{λ}^{(2)} (t)}{\partial λ}} \\ = \frac{κ_{H}^{2} {\tilde{X}}_{0} t}{σ^{2}} + \frac{η κ_{H}^{2} t}{2 σ^{2}} [\frac{1}{κ_{H}} + t] - \frac{η κ_{H} t}{2 σ^{2}} - \frac{κ_{H}^{2} {\tilde{X}}_{0} t}{2 σ^{2}} - \frac{η κ_{H}^{2} t^{2}}{4 σ^{2}} = \frac{κ_{H}^{2}}{2 σ^{2}} \cdot [\frac{η}{2} \cdot t^{2} + {\tilde{X}}_{0} \cdot t] . \end{matrix}

(A101)

Let us now turn to the second limit in (A96), for which we compute analogously to (A98)

\begin{matrix} lim_{λ ↗ 1} \frac{1 - d_{λ, {\tilde{X}}_{0}, t}^{U}}{λ \cdot (1 - λ)} = lim_{λ ↗ 1} \frac{- d_{λ, {\tilde{X}}_{0}, t}^{U}}{1 - 2 λ} \frac{\partial}{\partial λ} [- \frac{Λ_{λ} - κ_{λ}}{σ^{2}} \cdot [{\tilde{X}}_{0} - \frac{η}{\frac{1}{2} (Λ_{λ} + κ_{λ})}] \cdot (1 - e^{- \frac{1}{2} (Λ_{λ} + κ_{λ}) \cdot t}) \\ - \frac{η}{σ^{2}} \cdot (Λ_{λ} - κ_{λ}) \cdot t - U_{λ}^{(1)} (t) \cdot {\tilde{X}}_{0} - \frac{η}{σ^{2}} \cdot U_{λ}^{(2)} (t)] \\ = lim_{λ ↗ 1} {- \frac{Λ_{λ} - κ_{λ}}{σ^{2}} [({\tilde{X}}_{0} - \frac{η}{\frac{1}{2} (Λ_{λ} + κ_{λ})}) \cdot \frac{t}{2} \cdot e^{- \frac{1}{2} (Λ_{λ} + κ_{λ}) \cdot t} \frac{\partial}{\partial λ} (Λ_{λ} + κ_{λ}) \\ + (1 - e^{- \frac{1}{2} (Λ_{λ} + κ_{λ}) \cdot t}) \cdot \frac{2 \cdot η}{{(Λ_{λ} + κ_{λ})}^{2}} \cdot \frac{\partial}{\partial λ} (Λ_{λ} + κ_{λ})] \\ - \frac{1}{σ^{2}} \cdot \frac{\partial}{\partial λ} (Λ_{λ} - κ_{λ}) \cdot ({\tilde{X}}_{0} - \frac{η}{\frac{1}{2} (Λ_{λ} + κ_{λ})}) \cdot (1 - e^{- \frac{1}{2} (Λ_{λ} + κ_{λ}) \cdot t}) - \frac{η t}{σ^{2}} \cdot \frac{\partial}{\partial λ} (Λ_{λ} - κ_{λ}) \\ - \frac{\partial U_{λ}^{(1)} (t)}{\partial λ} \cdot {\tilde{X}}_{0} - \frac{η}{σ^{2}} \frac{\partial U_{λ}^{(2)} (t)}{\partial λ}} . \end{matrix}

(A102)

For the case

κ_{A} > 0

, one can combine this with (A97), (A99) and (A74) to end up with

lim_{λ ↗ 1} \frac{1 - d_{λ, {\tilde{X}}_{0}, t}^{U}}{λ \cdot (1 - λ)} = \frac{{(κ_{A} - κ_{H})}^{2}}{2 σ^{2} \cdot κ_{A}} \cdot [({\tilde{X}}_{0} - \frac{η}{κ_{A}}) \cdot (1 - e^{- κ_{A} \cdot t}) + η \cdot t] .

(A103)

For the case

κ_{A} = 0

, we continue the calculation of (A102) by rearranging terms and by employing the formulas (A77), (A82) and (A83) as well as the obvious relation

{lim}_{λ ↗ 1} \frac{1}{Λ_{λ}} - \frac{Λ_{λ} - κ_{λ}}{Λ_{λ} (Λ_{λ} + κ_{λ})} = \frac{2}{κ_{H}}

to obtain

\begin{matrix} lim_{λ ↗ 1} \frac{1 - d_{λ, {\tilde{X}}_{0}, t}^{U}}{λ \cdot (1 - λ)} = lim_{λ ↗ 1} {\frac{t \cdot {\tilde{X}}_{0}}{4 σ^{2}} \cdot \frac{Λ_{λ} - κ_{λ}}{Λ_{λ}} \cdot e^{- \frac{1}{2} (Λ_{λ} + κ_{λ}) \cdot t} (κ_{H}^{2} + 2 Λ_{λ} κ_{H}) \\ + \frac{{\tilde{X}}_{0}}{2 σ^{2}} \cdot \frac{1 - e^{- \frac{1}{2} (Λ_{λ} + κ_{λ}) \cdot t}}{Λ_{λ}} (κ_{H}^{2} - 2 Λ_{λ} κ_{H}) - \frac{η \cdot t}{σ^{2}} [κ_{H} (1 + e^{- \frac{1}{2} (Λ_{λ} + κ_{λ}) \cdot t} \frac{Λ_{λ} - κ_{λ}}{Λ_{λ} + κ_{λ}}) \\ - \frac{κ_{H}^{2}}{2} \cdot (\frac{1}{Λ_{λ}} - \frac{Λ_{λ} - κ_{λ}}{Λ_{λ} (Λ_{λ} + κ_{λ})} + \frac{Λ_{λ} - κ_{λ}}{Λ_{λ} + κ_{λ}} \cdot \frac{1 - e^{- \frac{1}{2} (Λ_{λ} + κ_{λ}) \cdot t}}{Λ_{λ}})] \\ + \frac{2 η}{σ^{2}} \cdot \frac{1 - e^{- \frac{1}{2} (Λ_{λ} + κ_{λ}) \cdot t}}{Λ_{λ} + κ_{λ}} [κ_{H} (1 + \frac{Λ_{λ} - κ_{λ}}{Λ_{λ} + κ_{λ}}) - \frac{κ_{H}^{2}}{2} (\frac{1}{Λ_{λ}} - \frac{Λ_{λ} - κ_{λ}}{Λ_{λ} (Λ_{λ} + κ_{λ})})] \\ - \frac{\partial U_{λ}^{(1)} (t)}{\partial λ} \cdot {\tilde{X}}_{0} - \frac{η}{σ^{2}} \frac{\partial U_{λ}^{(2)} (t)}{\partial λ}} \\ = \frac{κ_{H}^{2} t {\tilde{X}}_{0}}{4 σ^{2}} + \frac{κ_{H}^{2} t {\tilde{X}}_{0}}{4 σ^{2}} - \frac{η t}{σ^{2}} [2 κ_{H} - κ_{H} - \frac{κ_{H}^{2} t}{4}] + \frac{η t}{σ^{2}} [2 κ_{H} - κ_{H}] = \frac{κ_{H}^{2}}{2 σ^{2}} [\frac{η}{2} \cdot t^{2} + {\tilde{X}}_{0} \cdot t] . \end{matrix}

(A104)

Since (A100) coincides with (A103) and (A101) coincides with (A104), we have finished the proof. □
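As a side remark that is not needed for the proof, the limit expression for κ_A > 0 in (A100) and (A103) tends to the κ_A = 0 expression in (A101) and (A104) when κ_A decreases to zero, which follows from a second-order expansion of 1 - e^{- κ_A t}. The Python sketch below illustrates this numerically; the function names and sample values are hypothetical.

```python
import math


def limit_pos(kappa_a, kappa_h, eta, sigma2, x0_tilde, t):
    """Right-hand side of (A100)/(A103), valid for kappa_A > 0."""
    pref = (kappa_a - kappa_h) ** 2 / (2.0 * sigma2 * kappa_a)
    one_minus_exp = -math.expm1(-kappa_a * t)  # 1 - e^{-kappa_A t}
    return pref * ((x0_tilde - eta / kappa_a) * one_minus_exp + eta * t)


def limit_zero(kappa_h, eta, sigma2, x0_tilde, t):
    """Right-hand side of (A101)/(A104), the case kappa_A = 0."""
    return kappa_h**2 / (2.0 * sigma2) * (eta / 2.0 * t**2 + x0_tilde * t)


if __name__ == "__main__":
    kh, eta, s2, x0t, t = 3.0, 2.0, 2.0, 1.5, 1.5  # hypothetical sample values
    target = limit_zero(kh, eta, s2, x0t, t)
    for ka in (1e-1, 1e-3, 1e-5):
        print(ka, limit_pos(ka, kh, eta, s2, x0t, t), target)
```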

References

Liese, F.; Vajda, I. Convex Statistical Distances; Teubner: Leipzig, Germany, 1987. [Google Scholar]
Read, T.R.C.; Cressie, N.A.C. Goodness-of-Fit Statistics for Discrete Multivariate Data; Springer: New York, NY, USA, 1988. [Google Scholar]
Vajda, I. Theory of Statistical Inference and Information; Kluwer: Dordrecht, The Netherlands, 1989. [Google Scholar]
Csiszár, I.; Shields, P.C. Information Theory and Statistics: A Tutorial; Now Publishers: Hanover, MA, USA, 2004. [Google Scholar]
Stummer, W. Exponentials, Diffusions, Finance, Entropy and Information; Shaker: Aachen, Germany, 2004. [Google Scholar]
Pardo, L. Statistical Inference Based on Divergence Measures; Chapman & Hall/CRC: Bocan Raton, FL, USA, 2006. [Google Scholar]
Liese, F.; Miescke, K.J. Statistical Decision Theory: Estimation, Testing, and Selection; Springer: New York, NY, USA, 2008. [Google Scholar]
Basu, A.; Shioya, H.; Park, C. Statistical Inference: The Minimum Distance Approach; CRC Press: Boca Raton, FL, USA, 2011. [Google Scholar]
Voinov, V.; Nikulin, M.; Balakrishnan, N. Chi-Squared Goodness of Fit Tests with Applications; Academic Press: Waltham, MA, USA, 2013. [Google Scholar]
Liese, F.; Vajda, I. On divergences and informations in statistics and information theory. IEEE Trans. Inform. Theory 2006, 52, 4394–4412. [Google Scholar]
Vajda, I.; van der Meulen, E.C. Goodness-of-fit criteria based on observations quantized by hypothetical and empirical percentiles. In Handbook of Fitting Statistical Distributions with R; Karian, Z.A., Dudewicz, E.J., Eds.; CRC: Heidelberg, Germany, 2010; pp. 917–994. [Google Scholar]
Stummer, W.; Vajda, I. On Bregman distances and divergences of probability measures. IEEE Trans. Inform. Theory 2012, 58, 1277–1288. [Google Scholar]
Kißlinger, A.-L.; Stummer, W. Robust statistical engineering by means of scaled Bregman distances. In Recent Advances in Robust Statistics–Theory and Applications; Agostinelli, C., Basu, A., Filzmoser, P., Mukherjee, D., Eds.; Springer: New Delhi, India, 2016; pp. 81–113. [Google Scholar]
Broniatowski, M.; Stummer, W. Some universal insights on divergences for statistics, machine learning and artificial intelligence. In Geometric Structures of Information; Nielsen, F., Ed.; Springer: Cham, Switzerland, 2019; pp. 149–211. [Google Scholar]
Stummer, W.; Vajda, I. Optimal statistical decisions about some alternative financial models. J. Econom. 2007, 137, 441–471. [Google Scholar]
Stummer, W.; Lao, W. Limits of Bayesian decision related quantities of binomial asset price models. Kybernetika 2012, 48, 750–767. [Google Scholar]
Csiszar, I. Eine informationstheoretische Ungleichung und ihre Anwendung auf den Beweis der Ergodizität von Markoffschen Ketten. Publ. Math. Inst. Hungar. Acad. Sci. 1963, A-8, 85–108. [Google Scholar]
Ali, M.S.; Silvey, D. A general class of coefficients of divergence of one distribution from another. J. Roy. Statist. Soc. B 1966, 28, 131–140. [Google Scholar]
Morimoto, T. Markov processes and the H-theorem. J. Phys. Soc. Jpn 1963, 18, 328–331. [Google Scholar]
van Erven, T.; Harremoes, P. Renyi divergence and Kullback-Leibler divergence. IEEE Trans. Inf. Theory 2014, 60, 3797–3820. [Google Scholar]
Newman, C.M. On the orthogonality of independent increment processes. In Topics in Probability Theory; Courant Institute of Mathematical Sciences New York University: New York, NY, USA, 1973; pp. 93–111. [Google Scholar]
Liese, F. Hellinger integrals of Gaussian processes with independent increments. Stochastics 1982, 6, 81–96. [Google Scholar]
Memin, J.; Shiryayev, A.N. Distance de Hellinger-Kakutani des lois correspondant a deux processus a accroissements indépendants. Probab. Theory Relat. Fields 1985, 70, 67–89. [Google Scholar]
Jacod, J.; Shiryaev, A.N. Limit Theorems for Stochastic Processes; Springer: Berlin, Germany, 1987. [Google Scholar]
Linkov, Y.N.; Shevlyakov, Y.A. Large deviation theorems in the hypotheses testing problems for processes with independent increments. Theory Stoch. Process 1998, 4, 198–210. [Google Scholar]
Liese, F. Hellinger integrals, error probabilities and contiguity of Gaussian processes with independent increments and Poisson processes. J. Inf. Process. Cybern. 1985, 21, 297–313. [Google Scholar]
Kabanov, Y.M.; Liptser, R.S.; Shiryaev, A.N. On the variation distance for probability measures defined on a filtered space. Probab. Theory Relat. Fields 1986, 71, 19–35. [Google Scholar]
Liese, F. Hellinger integrals of diffusion processes. Statistics 1986, 17, 63–78. [Google Scholar]
Vajda, I. Distances and discrimination rates for stochastic processes. Stoch. Process. Appl. 1990, 35, 47–57. [Google Scholar]
Stummer, W. The Novikov and entropy conditions of multidimensional diffusion processes with singular drift. Probab. Theory Relat. Fields 1993, 97, 515–542. [Google Scholar]
Stummer, W. On a statistical information measure of diffusion processes. Stat. Decis. 1999, 17, 359–376. [Google Scholar]
Stummer, W. On a statistical information measure for a generalized Samuelson-Black-Scholes model. Stat. Decis. 2001, 19, 289–314. [Google Scholar]
Bartoszynski, R. Branching processes and the theory of epidemics. In Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, Vol. IV; Le Cam, L.M., Neyman, J., Eds.; University of California Press: Berkeley, CA, USA, 1967; pp. 259–269. [Google Scholar]
Ludwig, D. Qualitative behaviour of stochastic epidemics. Math. Biosci. 1975, 23, 47–73. [Google Scholar]
Becker, N.G. Estimation for an epidemic model. Biometrics 1976, 32, 769–777. [Google Scholar]
Becker, N.G. Estimation for discrete time branching processes with applications to epidemics. Biometrics 1977, 33, 515–522. [Google Scholar]
Metz, J.A.J. The epidemic in a closed population with all susceptibles equally vulnerable; some results for large susceptible populations and small initial infections. Acta Biotheor. 1978, 27, 75–123. [Google Scholar]
Heyde, C.C. On assessing the potential severity of an outbreak of a rare infectious disease. Austral. J. Stat. 1979, 21, 282–292. [Google Scholar]
Von Bahr, B.; Martin-Löf, A. Threshold limit theorems for some epidemic processes. Adv. Appl. Prob. 1980, 12, 319–349. [Google Scholar]
Ball, F. The threshold behaviour of epidemic models. J. Appl. Prob. 1983, 20, 227–241. [Google Scholar]
Jacob, C. Branching processes: Their role in epidemics. Int. J. Environ. Res. Public Health 2010, 7, 1186–1204. [Google Scholar]
Barbour, A.D.; Reinert, G. Approximating the epidemic curve. Electron. J. Probab. 2013, 18, 1–30. [Google Scholar]
Britton, T.; Pardoux, E. Stochastic epidemics in a homogeneous community. In Stochastic Epidemic Models; Britton, T., Pardoux, E., Eds.; Springer: Cham, Switzerland, 2019; pp. 1–120. [Google Scholar]
Dion, J.P.; Gauthier, G.; Latour, A. Branching processes with immigration and integer-valued time series. Serdica Math. J. 1995, 21, 123–136. [Google Scholar]
Grunwald, G.K.; Hyndman, R.J.; Tedesco, L.; Tweedie, R.L. Non-Gaussian conditional linear AR(1) models. Aust. N. Z. J. Stat. 2000, 42, 479–495. [Google Scholar]
Kedem, B.; Fokianos, K. An Regression Models for Time Series Analysis; Wiley: Hoboken, NJ, USA, 2002. [Google Scholar]
Held, L.; Höhle, M.; Hofmann, M. A statistical framework for the analysis of multivariate infectious disease surveillance counts. Stat. Model. 2005, 5, 187–199. [Google Scholar]
Weiss, C.H. An Introduction to Discrete-Valued Time Series; Wiley: Hoboken, NJ, USA, 2018. [Google Scholar]
Feigin, P.D.; Passy, U. The geometric programming dual to the extinction probability problem in simple branching processes. Ann. Probab. 1981, 9, 498–503. [Google Scholar]
Mordecki, E. Asymptotic mixed normality and Hellinger processes. Stoch. Stoch. Rep. 1994, 48, 129–143. [Google Scholar]
Sriram, T.N.; Vidyashankar, A.N. Minimum Hellinger distance estimation for supercritical Galton-Watson processes. Stat. Probab. Lett. 2000, 50, 331–342. [Google Scholar]
Guttorp, P. Statistical Inference for Branching Processes; Wiley: New York, NY, USA, 1991. [Google Scholar]
Linkov, Y.N.; Lunyova, L.A. Large deviation theorems in the hypothesis testing problems for the Galton-Watson processes with immigration. Theory Stoch. Process 1996, 2, 120–132, Erratum in Theory Stoch. Process 1997, 3, 270–285. [Google Scholar]
Heathcote, C.R. A branching process allowing immigration. J. R. Stat. Soc. B 1965, 27, 138–143, Erratum in: Heathcote, C.R. Corrections and comments on the paper “A branching process allowing immigration”. J. R. Stat. Soc. B 1966, 28, 213–217. [Google Scholar]
Athreya, K.B.; Ney, P.E. Branching Processes; Springer: New York, NY, USA, 1972. [Google Scholar]
Jagers, P. Branching Processes with Biological Applications; Wiley: London, UK, 1975. [Google Scholar]
Asmussen, S.; Hering, H. Branching Processes; Birkhäuser: Boston, MA, USA, 1983. [Google Scholar]
Haccou, P.; Jagers, P.; Vatutin, V.A. Branching Processes: Variation, Growth, and Extinction of Populations; Cambrigde University Press: Cambridge, UK, 2005. [Google Scholar]
Heyde, C.C.; Seneta, E. Estimation theory for growth and immigration rates in a multiplicative process. J. Appl. Probab. 1972, 9, 235–256. [Google Scholar]
Basawa, I.V.; Rao, B.L.S. Statistical Inference of Stochastic Processes; Academic Press: London, UK, 1980. [Google Scholar]
Basawa, I.V.; Scott, D.J. Asymptotic Optimal Inference for Non-Ergodic Models; Springer: New York, NY, USA, 1983. [Google Scholar]
Sankaranarayanan, G. Branching Processes and Its Estimation Theory; Wiley: New Delhi, India, 1989. [Google Scholar]
Wei, C.Z.; Winnicki, J. Estimation of the means in the branching process with immigration. Ann. Stat. 1990, 18, 1757–1773. [Google Scholar]
Winnicki, J. Estimation of the variances in the branching process with immigration. Probab. Theory Relat. Fields 1991, 88, 77–106. [Google Scholar]
Yanev, N.M. Statistical inference for branching processes. In Records and Branching Processes; Ahsanullah, M., Yanev, G.P., Eds.; Nova Science Publishes: New York, NY, USA, 2008; pp. 147–172. [Google Scholar]
Harris, T.E. The Theory of Branching Processes; Springer: Berlin, Germany, 1963. [Google Scholar]
Gauthier, G.; Latour, A. Convergence forte des estimateurs des parametres d’un processus GENAR(p). Ann. Sci. Math. Que. 1994, 18, 49–71. [Google Scholar]
Latour, A. Existence and stochastic structure of a non-negative integer-valued autoregressive process. J. Time Ser. Anal. 1998, 19, 439–455. [Google Scholar]
Rydberg, T.H.; Shephard, N. BIN models for trade-by-trade data. Modelling the number of trades in a fixed interval of time. In Econometric Society World Congress; Contributed Papers No. 0740; Econometric Society: Cambridge, UK, 2000. [Google Scholar]
Brandt, P.T.; Williams, J.T. A linear Poisson autoregressive model: The Poisson AR(p) model. Polit. Anal. 2001, 9, 164–184. [Google Scholar]
Heinen, A. Modelling time series count data: An autoregressive conditional Poisson model. In Core Discussion Paper; MPRA Paper No. 8113; University of Louvain: Louvain, Belgium, 2003; Volume 62, Available online: https://mpra.ub.uni-muenchen.de/8113 (accessed on 18 May 2020).
Held, L.; Hofmann, M.; Höhle, M.; Schmid, V. A two-component model for counts of infectious diseases. Biostatistics 2006, 7, 422–437. [Google Scholar]
Finkenstädt, B.F.; Bjornstad, O.N.; Grenfell, B.T. A stochastic model for extinction and recurrence of epidemics: Estimation and inference for measles outbreak. Biostatistics 2002, 3, 493–510. [Google Scholar]
Ferland, R.; Latour, A.; Oraichi, D. Integer-valued GARCH process. J. Time Ser. Anal. 2006, 27, 923–942. [Google Scholar]
Weiß, C.H. Modelling time series of counts with overdispersion. Stat. Methods Appl. 2009, 18, 507–519. [Google Scholar]
Weiß, C.H. The INARCH(1) model for overdispersed time series of counts. Comm. Stat. Sim. Comp. 2010, 39, 1269–1291. [Google Scholar]
Weiß, C.H. INARCH(1) processes: Higher-order moments and jumps. Stat. Probab. Lett. 2010, 80, 1771–1780. [Google Scholar]
Weiß, C.H.; Testik, M.C. Detection of abrupt changes in count data time series: Cumulative sum derivations for INARCH(1) models. J. Qual. Technol. 2012, 44, 249–264. [Google Scholar]
Kaslow, R.A.; Evans, A.S. Epidemiologic concepts and methods. In Viral Infections of Humans; Evans, A.S., Kaslow, R.A., Eds.; Springer: New York, NY, USA, 1997; pp. 3–58. [Google Scholar]
Osterholm, M.T.; Hedberg, C.W. Epidemiologic principles. In Mandell, Douglas, and Bennett’s Principles and Practice of Infectious Diseases, 8th ed.; Bennett, J.E., Dolin, R., Blaser, M.J., Eds.; Elsevier: Philadelphia, PA, USA, 2015; pp. 146–157. [Google Scholar]
Grassly, N.C.; Fraser, C. Mathematical models of infectious disease transmission. Nat. Rev. 2008, 6, 477–487. [Google Scholar]
Keeling, M.J.; Rohani, P. Modeling Infectious Diseases in Humans and Animals; Princeton UP: Princeton, NJ, USA, 2008. [Google Scholar]
Yan, P. Distribution theory stochastic processes and infectious disease modelling. In Mathematical Epidemiology; Brauer, F., van den Driessche, P., Wu, J., Eds.; Springer: Berlin, Germany, 2008; pp. 229–293. [Google Scholar]
Yan, P.; Chowell, G. Quantitative Methods for Investigating Infectious Disease Outbreaks; Springer: Cham, Switzerland, 2019. [Google Scholar]
Britton, T. Stochastic epidemic models: A survey. Math. Biosc. 2010, 225, 24–35. [Google Scholar]
Diekmann, O.; Heesterbeek, H.; Britton, T. Mathematical Tools for Understanding Infectious Disease Dynamics; Princeton University Press: Princeton, NJ, USA, 2013. [Google Scholar]
Cummings, D.A.T.; Lessler, J. Infectious disease dynamics. In Infectious Disease Epidemiology: Theory and Practice; Nelson, K.E., Masters Williams, C., Eds.; Jones & Bartlett Learning: Burlington, MA, USA, 2014; pp. 131–166. [Google Scholar]
Just, W.; Callender, H.; Drew LaMar, M.; Toporikova, N. Transmission of infectious diseases: Data, models and simulations. In Algebraic and Discrete Mathematical Methods of Modern Biology; Robeva, R.S., Ed.; Elsevier: London, UK, 2015; pp. 193–215. [Google Scholar]
Britton, T.; Giardina, F. Introduction to statistical inference for infectious diseases. J. Soc. Franc. Stat. 2016, 157, 53–70. [Google Scholar]
Fine, P.E.M. The interval between successive cases of an infectious disease. Am. J. Epidemiol. 2003, 158, 1039–1047. [Google Scholar]
Svensson, A. A note on generation times in epidemic models. Math. Biosci. 2007, 208, 300–311. [Google Scholar]
Svensson, A. The influence of assumptions on generation time distributions in epidemic models. Math. Biosci. 2015, 270, 81–89. [Google Scholar]
Wallinga, J.; Lipsitch, M. How generation intervals shape the relationship between growth rates and reproductive numbers. Proc. R. Soc. B 2007, 274, 599–604. [Google Scholar]
Forsberg White, L.; Pagano, M. A likelihood-based method for real-time estimation of the serial interval and reproductive number of an epidemic. Stat. Med. 2008, 27, 2999–3016. [Google Scholar]
Nishiura, H. Time variations in the generation time of an infectious disease: Implications for sampling to appropriately quantify transmission potential. Math. Biosci. 2010, 7, 851–869. [Google Scholar]
Scalia Tomba, G.; Svensson, A.; Asikainen, T.; Giesecke, J. Some model based considerations on observing generation times for communicable diseases. Math. Biosci. 2010, 223, 24–31. [Google Scholar]
Trichereau, J.; Verret, C.; Mayet, A.; Manet, G. Estimation of the reproductive number for A(H1N1) pdm09 influenza among the French armed forces, September 2009–March 2010. J. Infect. 2012, 64, 628–630. [Google Scholar]
Vink, M.A.; Bootsma, M.C.J.; Wallinga, J. Serial intervals of respiratory infectious diseases: A systematic review and analysis. Am. J. Epidemiol. 2014, 180, 865–875. [Google Scholar]
Champredon, D.; Dushoff, J. Intrinsic and realized generation intervals in infectious-disease transmission. Proc. R. Soc. B 2015, 282, 20152026. [Google Scholar]
An der Heiden, M.; Hamouda, O. Schätzung der aktuellen Entwicklung der SARS-CoV-2-Epidemie in Deutschland— Nowcasting. Epid. Bull. 2020, 17, 10–16. (In Germany) [Google Scholar]
Ferretti, L.; Wymant, C.; Kendall, M.; Zhao, L.; Nurtay, A.; Abeler-Dörner, L.; Parker, M.; Bonsall, D.; Fraser, C. Quantifying SARS-CoV-2 transmission suggests epidemic control with digital contact tracing. Science 2020, 368, eabb6936. [Google Scholar]
Ganyani, T.; Kremer, C.; Chen, D.; Torneri, A.; Faes, C.; Wallinga, J.; Hens, N. Estimating the generation interval for COVID-19 based on symptom onset data. medRxiv Prepr. 2020. [Google Scholar] [CrossRef] [Green Version]
Li, M.; Liu, K.; Song, Y.; Wang, M.; Wu, J. Serial interval and generation interval for respectively the imported and local infectors estimated using reported contact-tracing data of COVID-19 in China. medRxiv Prepr. 2020. [Google Scholar] [CrossRef]
Nishiura, H.; Linton, N.M.; Akhmetzhanov, A.R. Serial interval of novel coronavirus (COVID-19) infections. medRxiv Prepr. 2020. [Google Scholar] [CrossRef] [Green Version]
Park, M.; Cook, A.R.; Lim, J.J.; Sun, X.; Dickens, B.L. A systematic review of COVID-19 epidemiology based on current evidence. J. Clin. Med. 2020, 9, 967. [Google Scholar] [CrossRef] [Green Version]
Spouge, J.L. An accurate approximation for the expected site frequency spectrum in a Galton-Watson process under an infinite sites mutation model. Theor. Popul. Biol. 2019, 127, 7–15. [Google Scholar]
Taneyhill, D.E.; Dunn, A.M.; Hatcher, M.J. The Galton-Watson branching process as a quantitative tool in parasitology. Parasitol. Today 1999, 15, 159–165. [Google Scholar]
Parnes, D. Analyzing the contagion effect of foreclosures as a branching process: A close look at the years that follow the Great Recession. J. Account. Financ. 2017, 17, 9–34. [Google Scholar]
Le Cam, L. Asymptotic Methods in Statistical Decision Theory; Springer: New York, NY, USA, 1986. [Google Scholar]
Heyde, C.C.; Johnstone, I.M. On asymptotic posterior normality for stochastic processes. J. R. Stat. Soc. B 1979, 41, 184–189. [Google Scholar]
Johnson, R.A.; Susarla, V.; van Ryzin, J. Bayesian non-parametric estimation for age-dependent branching processes. Stoch. Proc. Appl. 1979, 9, 307–318. [Google Scholar]
Scott, D. On posterior asymptotic normality and asymptotic normality of estimators for the Galton-Watson process. J. R. Stat. Soc. B 1987, 49, 209–214. [Google Scholar]
Yanev, N.M.; Tsokos, C.P. Decision-theoretic estimation of the offspring mean in mortal branching processes. Comm. Stat. Stoch. Models 1999, 15, 889–902. [Google Scholar]
Mendoza, M.; Gutierrez-Pena, E. Bayesian conjugate analysis of the Galton-Watson process. Test 2000, 9, 149–171. [Google Scholar]
Feicht, R.; Stummer, W. An explicit nonstationary stochastic growth model. In Economic Growth and Development (Frontiers of Economics and Globalization, Vol. 11); De La Grandville, O., Ed.; Emerald Group Publishing Limited: Bingley, UK, 2011; pp. 141–202. [Google Scholar]
Dorn, F.; Fuest, C.; Göttert, M.; Krolage, C.; Lautenbacher, S.; Link, S.; Peichl, A.; Reif, M.; Sauer, S.; Stöckli, M.; et al. Die volkswirtschaftlichen Kosten des Corona-Shutdown für Deutschland: Eine Szenarienrechnung. ifo Schnelldienst 2020, 73, 29–35. (In German) [Google Scholar]
Dorn, F.; Khailaie, S.; Stöckli, M.; Binder, S.; Lange, B.; Peichl, A.; Vanella, P.; Wollmershäuser, T.; Fuest, C.; Meyer-Hermann, M. Das gemeinsame Interesse von Gesundheit und Wirtschaft: Eine Szenarienrechnung zur Eindämmung der Corona-Pandemie. ifo Schnelld. Dig. 2020, 6, 1–9. (In German) [Google Scholar]
Kißlinger, A.-L.; Stummer, W. A new toolkit for robust distributional change detection. Appl. Stoch. Models Bus. Ind. 2018, 34, 682–699. [Google Scholar]
Dehning, J.; Zierenberg, J.; Spitzner, F.P.; Wibral, M.; Neto, J.P.; Wilczek, M.; Priesemann, V. Inferring change points in the spread of COVID-19 reveals the effectiveness of interventions. Science 2020, 369, eabb9789. [Google Scholar] [CrossRef]
Frisén, M. Statistical surveillance. Optimality and methods. Int. Stat. Rev. 2003, 71, 403–434. [Google Scholar]
Frisén, M.; Andersson, E.; Schiöler, L. Robust outbreak surveillance of epidemics in Sweden. Stat. Med. 2009, 28, 476–493. [Google Scholar]
Brauner, J.M.; Mindermann, S.; Sharma, M.; Stephenson, A.B.; Gavenciak, T.; Johnston, D.; Salvatier, J.; Leech, G.; Besiroglu, T.; Altman, G.; et al. The effectiveness and perceived burden of nonpharmaceutical interventions against COVID-19 transmission: A modelling study with 41 countries. medRxiv Prepr. 2020. [Google Scholar] [CrossRef]
Österreicher, F.; Vajda, I. Statistical information and discrimination. IEEE Trans. Inform. Theory 1993, 39, 1036–1039. [Google Scholar]
De Groot, M.H. Uncertainty, information and sequential experiments. Ann. Math. Statist. 1962, 33, 404–419. [Google Scholar]
Krafft, O.; Plachky, D. Bounds for the power of likelihood ratio tests and their asymptotic properties. Ann. Math. Stat. 1970, 41, 1646–1654. [Google Scholar]
Basawa, I.V.; Scott, D.J. Efficient tests for branching processes. Biometrika 1976, 63, 531–536. [Google Scholar]
Feigin, P.D. The efficiency criteria problem for stochastic processes. Stoch. Proc. Appl. 1978, 6, 115–127. [Google Scholar]
Sweeting, T.J. On efficient tests for branching processes. Biometrika 1978, 65, 123–127. [Google Scholar]
Linkov, Y.N. Lectures in Mathematical Statistics, Parts 1 and 2; American Mathematical Society: Providence, RI, USA, 2005. [Google Scholar]
Feller, W. Diffusion processes in genetics. In Proceedings of the Second Berkeley Symposium on Mathematical Statistics and Probability; Neyman, J., Ed.; University of California Press: Berkeley, CA, USA, 1951; pp. 227–246. [Google Scholar]
Jirina, M. On Feller’s branching diffusion process. Časopis Pěst. Mat. 1969, 94, 84–89. [Google Scholar]
Lamperti, J. Limiting distributions for branching processes. In Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, Vol. II, Part 2; Le Cam, L.M., Neyman, J., Eds.; University of California Press: Berkeley, CA, USA, 1967; pp. 225–241. [Google Scholar]
Lamperti, J. The limit of a sequence of branching processes. Z. Wahrscheinlichkeitstheorie Verw. Geb. 1967, 7, 271–288. [Google Scholar]
Lindvall, T. Convergence of critical Galton-Watson branching processes. J. Appl. Prob. 1972, 9, 445–450. [Google Scholar]
Lindvall, T. Limit theorems for some functionals of certain Galton-Watson branching processes. Adv. Appl. Prob. 1974, 6, 309–321. [Google Scholar]
Grimvall, A. On the convergence of sequences of branching processes. Ann. Probab. 1974, 2, 1027–1045. [Google Scholar]
Borovkov, K.A. On the convergence of branching processes to a diffusion process. Theor. Probab. Appl. 1986, 30, 496–506. [Google Scholar]
Ethier, S.N.; Kurtz, T.G. Markov Processes: Characterization and Convergence; Wiley: New York, NY, USA, 1986. [Google Scholar]
Durrett, R. Stochastic Calculus; CRC Press: Boca Raton, FL, USA, 1996. [Google Scholar]
Kawazu, K.; Watanabe, S. Branching processes with immigration and related limit theorems. Theor. Probab. Appl. 1971, 16, 36–54. [Google Scholar]
Wei, C.Z.; Winnicki, J. Some asymptotic results for the branching process with immigration. Stoch. Process. Appl. 1989, 31, 261–282. [Google Scholar]
Sriram, T.N. Invalidity of bootstrap for critical branching processes with immigration. Ann. Stat. 1994, 22, 1013–1023. [Google Scholar]
Li, Z. Branching processes with immigration and related topics. Front. Math. China 2006, 1, 73–97. [Google Scholar]
Dawson, D.A.; Li, Z. Skew convolution semigroups and affine Markov processes. Ann. Probab. 2006, 34, 1103–1142. [Google Scholar]
Cox, J.C.; Ingersoll, J.E., Jr.; Ross, S.A. A theory of the term structure of interest rates. Econometrica 1985, 53, 385–407. [Google Scholar]
Cox, J.C.; Ross, S.A. The valuation of options for alternative processes. J. Finan. Econ. 1976, 3, 145–166. [Google Scholar]
Heston, S.L. A closed-form solution for options with stochastic volatilities with applications to bond and currency options. Rev. Finan. Stud. 1993, 6, 327–343. [Google Scholar]
Lansky, P.; Lanska, V. Diffusion approximation of the neuronal model with synaptic reversal potentials. Biol. Cybern. 1987, 56, 19–26. [Google Scholar]
Giorno, V.; Lansky, P.; Nobile, A.G.; Ricciardi, L.M. Diffusion approximation and first-passage-time problem for a model neuron. Biol. Cybern. 1988, 58, 387–404. [Google Scholar]
Lanska, V.; Lansky, P.; Smith, C.E. Synaptic transmission in a diffusion model for neuron activity. J. Theor. Biol. 1994, 166, 393–406. [Google Scholar]
Lansky, P.; Sacerdote, L.; Tomassetti, F. On the comparison of Feller and Ornstein-Uhlenbeck models for neural activity. Biol. Cybern. 1995, 73, 457–465. [Google Scholar]
Ditlevsen, S.; Lansky, P. Estimation of the input parameters in the Feller neuronal model. Phys. Rev. E 2006, 73, 061910. [Google Scholar]
Höpfner, R. On a set of data for the membrane potential in a neuron. Math. Biosci. 2007, 207, 275–301. [Google Scholar]
Lansky, P.; Ditlevsen, S. A review of the methods for signal estimation in stochastic diffusion leaky integrate-and-fire neuronal models. Biol. Cybern. 2008, 99, 253–262. [Google Scholar]
Pedersen, A.R. Estimating the nitrous oxide emission rate from the soil surface by means of a diffusion model. Scand. J. Stat. Theory Appl. 2000, 27, 385–403. [Google Scholar]
Aalen, O.O.; Gjessing, H.K. Survival models based on the Ornstein-Uhlenbeck process. Lifetime Data Anal. 2004, 10, 407–423. [Google Scholar]
Kammerer, N.B. Generalized-Relative-Entropy Type Distances Between Some Branching Processes and Their Diffusion Limits. Ph.D. Thesis, University of Erlangen-Nürnberg, Erlangen, Germany, 2011. [Google Scholar]

Figure 1. Bayes risk bounds (using $λ = 0.5$ (red/orange) resp. $λ = 0.9$ (blue/cyan)) and Bayes risk simulations (lightgrey/grey/black) on a unit (left graph) and logarithmic (right graph) scale in the parameter setup $(β_{A}, β_{H}, α_{A}, α_{H}) = (1.2, 0.9, 4, 3) \in P_{SP, 1}$, with initial population $X_{0} = 5$ and prior-loss constants $L_{A} = 300$ and $L_{H} = 150$.
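
The parameter setup of Figure 1 can be explored with a small simulation. The following is only a minimal sketch and not the authors' simulation code: it assumes that $β_{•}$ is the Poisson offspring mean and $α_{•}$ the Poisson immigration mean, so that the population size satisfies $X_{n} \mid X_{n-1} \sim \mathrm{Poisson}(β_{•} X_{n-1} + α_{•})$. The function names `poisson` and `simulate_gwi`, the seed, and the horizon of 10 generations are hypothetical choices made for illustration.

```python
import random


def poisson(lam, rng):
    """Draw a Poisson(lam) variate by counting unit-rate arrivals in [0, lam]."""
    count = 0
    t = rng.expovariate(1.0)  # time of the first arrival
    while t < lam:
        count += 1
        t += rng.expovariate(1.0)  # exponential waiting time to the next arrival
    return count


def simulate_gwi(beta, alpha, x0, n_steps, rng):
    """Simulate X_0, ..., X_n with X_k | X_{k-1} ~ Poisson(beta * X_{k-1} + alpha).

    Assumption: Poisson(beta) offspring per individual plus Poisson(alpha)
    immigrants per generation, all independent, which sums to one Poisson draw.
    """
    path = [x0]
    for _ in range(n_steps):
        path.append(poisson(beta * path[-1] + alpha, rng))
    return path


rng = random.Random(0)  # fixed seed, chosen arbitrarily for reproducibility
# Parameters (beta_A, beta_H, alpha_A, alpha_H) = (1.2, 0.9, 4, 3) and X_0 = 5
# are taken from the caption of Figure 1; n_steps = 10 is an illustrative choice.
path_A = simulate_gwi(beta=1.2, alpha=4, x0=5, n_steps=10, rng=rng)
path_H = simulate_gwi(beta=0.9, alpha=3, x0=5, n_steps=10, rng=rng)
print("A:", path_A)
print("H:", path_H)
```

The arrival-counting sampler avoids a dependency on external libraries; its running time grows linearly with the Poisson mean, which is acceptable for the short horizon used here.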

Figure 2. Different lower bounds $E_{n}^{L}$ (using $λ \in \{1.1, 1.5, 2\}$) and upper bounds $E_{n}^{U}$ (using $λ \in \{0.3, 0.5, 0.7\}$) of the minimal type II error probability $E_{ς} (P_{A, n} ∥ P_{H, n})$ for fixed level $ς = 0.05$ in the parameter setup $(β_{A}, β_{H}, α_{A}, α_{H}) = (0.3, 1.2, 1, 4) \in P_{SP, 1}$ together with initial population $X_{0} = 5$ on both a unit scale (left graph) and a logarithmic scale (right graph).

Figure 3. The lower bound $E_{n}^{L}$ (using $λ = 1.1$) and the upper bound $E_{n}^{U}$ (using $λ = 0.5$) of the minimal type II error probability $E_{ς} (P_{A, n} ∥ P_{H, n})$ for different levels $ς \in \{0.01, 0.05, 0.1\}$ in the parameter setup $(β_{A}, β_{H}, α_{A}, α_{H}) = (0.3, 1.2, 1, 4) \in P_{SP, 1}$ together with initial population $X_{0} = 5$ on both a unit scale (left graph) and a logarithmic scale (right graph).

Figure 4. Simulation of the process ${\tilde{X}}_{s}^{(m)}$ for the approximation steps $m \in \{13, 50, 200, 1000\}$ in the parameter setup $(η, κ_{•}, σ) = (5, 2, 0.4)$ and with initial starting value ${\tilde{X}}_{0} = 3$.

© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Kammerer, N.B.; Stummer, W. Some Dissimilarity Measures of Branching Processes and Optimal Decision Making in the Presence of Potential Pandemics. Entropy 2020, 22, 874. https://doi.org/10.3390/e22080874
