Bivariate Generalized Split-BREAK Process with Application in Modeling Crime Dynamics

Stojičić, Snežana; Stojanović, Vladica S.; Jovanović, Mihailo; Joksimović, Dušan; Radovanović, Radovan

doi:10.3390/math14050754


by Snežana Stojičić ¹, Vladica S. Stojanović ²,*, Mihailo Jovanović ², Dušan Joksimović ² and Radovan Radovanović ¹

¹ Department of Forensic Engineering, University of Criminal Investigation and Police Studies, 11000 Belgrade, Serbia

² Department of Informatics & Computer Sciences, University of Criminal Investigation and Police Studies, 11000 Belgrade, Serbia

* Author to whom correspondence should be addressed.

Mathematics 2026, 14(5), 754; https://doi.org/10.3390/math14050754

Submission received: 20 January 2026 / Revised: 19 February 2026 / Accepted: 20 February 2026 / Published: 24 February 2026

(This article belongs to the Special Issue Artificial and Computational Intelligence Innovations in Symbolic and Soft Computing for Theoretical and Applied Science)


Abstract

The manuscript proposes a new non-linear and non-stationary bivariate stochastic model, termed the two-dimensional Gaussian (generalized) Split-BREAK (2D-GSB) process, as a multivariate extension of the univariate GSB framework. The generalization consists of introducing a common threshold mechanism based on the norm of a bivariate innovation vector and a single synchronized Bernoulli indicator that jointly governs regime activation in both components. This structure induces cross-dependent regime shifts and yields a binomial–Gaussian mixture representation of the joint distribution, explicitly linking contemporaneous dependence with a common latent regime mechanism. The fundamental properties of the proposed model are established, with particular emphasis on its asymptotic behavior. Parameter estimation procedures are developed using both the method of moments (MoM) and the empirical characteristic function (ECF) approach, and their performance is evaluated through Monte Carlo simulations. An empirical application to daily crime data illustrates how the proposed framework captures synchronized structural shocks and heavy-tailed features in related crime categories. In comparison with a standard VAR(1) benchmark, the 2D-GSB specification provides a parsimonious yet substantially improved likelihood-based fit, thus offering a theoretically sound framework for analyzing multivariate time series characterized by synchronized regime shifts and heavy-tailed behavior.

Keywords:

bivariate stochastic processes; pronounced fluctuations; non-stationarity; bivariate Gaussian distribution; asymptotic properties; crime dynamics

MSC:

60E10; 60F05; 62M10

1. Introduction

The stochastic permanent break (STOPBREAK) process was first proposed by Engle and Smith [1] as a framework for modeling time series with permanent breaks and large fluctuations. Subsequently, many authors have considered models that rely, to a greater or lesser extent, on the concepts underlying the STOPBREAK process, primarily in the field of econometrics [2,3,4,5,6,7,8,9,10]. One generalization of this process, termed the Split-BREAK process, was proposed by Stojanović et al. [11] and Jovanović et al. [12]. Building on this framework, the Gaussian (or generalized) Split-BREAK (GSB) process was developed, extending the original model by introducing Gaussian innovations and latent dynamics with a Bernoulli noise indicator. Note that in the existing literature, the term “generalized” refers to the latent-mixture construction introduced in [11], which generalizes the basic STOPBREAK model introduced in [1]. Accordingly, in the present paper the innovation distribution is explicitly assumed to be Gaussian. In the aforementioned works, the distributional and asymptotic properties of the GSB process were established, along with various parameter estimation methods. In addition, the main distributional characteristics and empirical characteristic function (ECF)-based estimation procedures are discussed in detail in Stojanović et al. [13], while modifications involving Laplace and Cauchy innovations are presented in Stojanović et al. [14] and Ljajko et al. [15], respectively.

From a practical perspective, the GSB process has been applied to various empirical contexts, including stock market capitalization on the Belgrade Stock Exchange and the dynamics of infected and vaccinated individuals during the COVID-19 pandemic. These applications highlight the broader practical potential of the GSB model in economic and social dynamics. On the other hand, many real-world phenomena are manifested through several interrelated time series that evolve jointly. For instance, in a criminological context, time series representing ordinary robberies and aggravated robberies (organized or violent) are often correlated. In addition, common social factors (economic crisis, change in police strategies, or seasonal effects) may simultaneously generate spikes in both series, while each retains its own dynamics and internal structural properties.

While the univariate GSB model accounts for threshold-driven regime behavior within a single series, it cannot explicitly describe synchronized correlation structures across multiple time series. A key structural feature of the proposed 2D-GSB framework is the use of a single synchronized Bernoulli indicator

q_{t} (c)

, determined by the magnitude of the common innovation vector, whose formal definition is provided in Section 2. This structural modification represents the essential extension from the univariate GSB model and enables both components of the process to enter or exit a regime simultaneously, reflecting situations in which related time series (e.g., different but related crime types) are jointly affected by a common external shock (e.g., policy change, economic crisis). By linking regime activation directly to the innovation norm, the model captures coordinated abrupt changes in a transparent and analytically tractable manner, thereby extending the previous univariate GSB structure to the multivariate setting.

In particular, conventional VAR and multivariate GARCH frameworks may struggle to capture synchronized regime shifts driven by common shocks, since regime activation is not explicitly linked to innovation magnitudes. Instead of relying solely on linear dependence or volatility-based dynamics, typical of the aforementioned conventional multivariate specifications, the proposed 2D-GSB model includes a common threshold-driven regime indicator, determined by the magnitude of multivariate innovations. In this way, it allows for the capture of synchronized structural shocks and persistent regime effects within a single latent framework. More broadly, the proposed 2D-GSB framework is related to multivariate regime-switching and latent-state models, where structural changes are governed by unobserved mechanisms (for some recent ones, see, e.g., [16,17]). Unlike classical Markov-switching approaches, where regime transitions are driven by latent state dynamics, the regime activation in the 2D-GSB model is directly determined by the magnitude of innovations through a threshold mechanism. This places the proposed framework within the broader class of regime-dependent multivariate models while preserving analytical tractability through its binomial–Gaussian mixture representation.

In the following Section 2, the definition and basic stochastic structure of the 2D-GSB process are presented. Thereafter, Section 3 discusses additional stochastic properties of the 2D-GSB model, focusing on the distribution of its components and their asymptotic behavior. For this purpose, as in the one-dimensional GSB case, the characteristic function (CF) method is employed. Section 4 describes estimation procedures for the unknown parameters of the 2D-GSB process and establishes the asymptotic properties of the resulting estimators. In Section 5, Monte Carlo simulations of the proposed estimators are carried out, as well as an empirical application of the 2D-GSB process in the analysis of the dynamics of certain criminal offenses. Finally, Section 6 provides concluding remarks.

2. Definition and Structure of the 2D-GSB Process

As mentioned above, the two-dimensional generalized (or Gaussian) Split-BREAK (2D-GSB) process extends the univariate GSB framework by allowing two interrelated time series to evolve under regime-switching dynamics driven by large innovations. When the regime indicators are synchronized across both series, the resulting correlation structure and joint dynamics become particularly rich and suitable for modeling real-world systems characterized by parallel disturbances or transitions. The basic definitions of the 2D-GSB series can be given as follows.

Let

Y_{t} = {(Y_{t}^{(1)}, Y_{t}^{(2)})}^{⊤}

be a bivariate time series defined on discrete time

t \in N_{0}

. The process

(Y_{t})

evolves according to the 2D-GSB structure if the following equality holds:

Y_{t} = \begin{bmatrix} Y_{t}^{(1)} \\ Y_{t}^{(2)} \end{bmatrix} = μ_{t} + ε_{t} .

(1)

Here,

ε_{t} = {(ε_{t}^{(1)}, ε_{t}^{(2)})}^{⊤} ~ N (0_{2 \times 1}, Σ)

are the independent identically distributed (IID) innovations with bivariate Gaussian distribution and the variance–covariance matrix

Σ = \begin{bmatrix} σ_{1}^{2} & ρ σ_{1} σ_{2} \\ ρ σ_{1} σ_{2} & σ_{2}^{2} \end{bmatrix},

where

σ_{1}, σ_{2} > 0

are the standard deviations of univariate series

(ε_{t}^{(1)}), (ε_{t}^{(2)})

, respectively, and

|ρ| < 1

is their correlation. We also suppose that innovations

(ε_{t})

are defined on the same probability space

(Ω, F, P)

, equipped with a filtration

F = (F_{t}) .

Thus, the conditional mean and variance of

(ε_{t})

are, respectively,

E (ε_{t} | F_{t - 1}) = {(0,0)}^{⊤} = 0_{2 \times 1}, \quad \mathrm{Var} (ε_{t} | F_{t - 1}) = E (ε_{t} ε_{t}^{⊤} | F_{t - 1}) = Σ .

(2)

Additionally,

μ_{t} = {(μ_{t}^{(1)}, μ_{t}^{(2)})}^{⊤}

is a regime-sensitive latent mean vector process, termed the martingale mean process, defined by the recurrence relation:

μ_{t}^{(i)} = μ_{t - 1}^{(i)} + q_{t - 1} ε_{t - 1}^{(i)} = μ_{0}^{(i)} + \sum_{j = 0}^{t - 1} q_{j} ε_{j}^{(i)}, i = 1,2 .

(3)

Here, the equalities

μ_{0}^{(i)} \overset{a.s.}{=} m_{i} \ (\mathrm{const.})

,

i = 1,2

hold almost surely (a.s.), while

(q_{t})

is the noise indicator, i.e., the Bernoulli process defined by the equalities

q_{t} (c) = I \{ {‖ε_{t - 1}‖}^{2} > c \} = \begin{cases} 1, & {(ε_{t - 1}^{(1)})}^{2} + {(ε_{t - 1}^{(2)})}^{2} > c \\ 0, & {(ε_{t - 1}^{(1)})}^{2} + {(ε_{t - 1}^{(2)})}^{2} \leq c \end{cases}, \quad t = 1,2, \dots

(4)

where

‖\cdot‖

is the usual Euclidean norm in

R^{2}

and

q_{0} (c) \equiv 1

. The value

c > 0

represents the critical value of reaction, i.e., the magnitude of past innovations

(ε_{t})

determines whether the current innovation is incorporated into the latent process

(μ_{t}),

as defined in Equation (3). More precisely, when

q_{t - 1} = 0

, there is no change in the martingale means

μ_{t}^{(i)}, i = 1,2,

relative to their previous values

μ_{t - 1}^{(i)}

, implying that

Y_{t}^{(i)}

exhibits only a “small” fluctuation driven solely by

ε_{t}^{(i)}

. Conversely, when

q_{t - 1} = 1,

the process undergoes a pronounced (permanent) shift, resulting in a more substantial fluctuation of

Y_{t}^{(i)}

.

Thus, the regime switch in the 2D-GSB process is driven by an innovation threshold; that is, realized values of

(ε_{t})

determine the intensity of fluctuations in the underlying series

(Y_{t})

. Moreover, the synchronized indicator

q_{t} (c)

, defined in Equation (4), enables joint regime detection across both components of

(ε_{t})

, capturing structurally linked shocks in correlated innovation processes

(ε_{t}^{(i)}), i = 1,2

. This feature is particularly important when modeling interdependent time series, such as different but related categories of crime, which are discussed below. It also facilitates a coherent interpretation of latent dynamics and common external shocks.
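The recursion in Equations (1)–(4) can be simulated directly. The following is a minimal sketch under stated assumptions, not the authors' implementation: the function name `simulate_2d_gsb`, its argument names, and the array layout are hypothetical choices made for illustration only.

```python
import numpy as np

def simulate_2d_gsb(T, m, sigma1, sigma2, rho, c, seed=0):
    """Simulate Y_0..Y_T, mu_0..mu_T, eps_0..eps_T and q_0..q_T (illustrative)."""
    rng = np.random.default_rng(seed)
    m = np.asarray(m, dtype=float)
    Sigma = np.array([[sigma1**2, rho * sigma1 * sigma2],
                      [rho * sigma1 * sigma2, sigma2**2]])
    # IID bivariate Gaussian innovations eps_0, ..., eps_T.
    eps = rng.multivariate_normal(np.zeros(2), Sigma, size=T + 1)
    # Noise indicator: q_0 = 1 and q_t = I{||eps_{t-1}||^2 > c} for t >= 1.
    q = np.ones(T + 1, dtype=int)
    q[1:] = (np.sum(eps[:-1] ** 2, axis=1) > c).astype(int)
    # Martingale means: mu_t = m + sum_{j=0}^{t-1} q_j * eps_j, with mu_0 = m.
    mu = np.empty((T + 1, 2))
    mu[0] = m
    mu[1:] = m + np.cumsum(q[:-1, None] * eps[:-1], axis=0)
    # Observed series: Y_t = mu_t + eps_t.
    Y = mu + eps
    return Y, mu, eps, q
```

For example, `simulate_2d_gsb(500, [10.0, 5.0], 1.0, 2.0, 0.5, 3.0)` returns four arrays of length 501; the indicator is shared by both components, so a single large innovation shifts both martingale means at once.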

Now, similar to the one-dimensional case (see, e.g., Jovanović et al. [12]), the following properties of the above-mentioned time series can be proven.

Theorem 1.

Let

(μ_{t})

and

(Y_{t})

be the two-dimensional stochastic processes defined by Equations (1)–(4). Then both series have constant and equal mean values

E (μ_{t}) = E (Y_{t}) = {(m_{1}, m_{2})}^{⊤},

while their covariance matrices are, respectively,

Γ_{μ} (s, t) = \mathrm{Cov} (μ_{s}, μ_{t}) = \begin{cases} p_{c} \cdot t \cdot Σ, & s = t \\ p_{c} \cdot \min (s, t) \cdot Σ, & s \neq t \end{cases},

(5)

Γ_{Y} (s, t) = \mathrm{Cov} (Y_{s}, Y_{t}) = \begin{cases} (1 + p_{c} t) \cdot Σ, & s = t \\ p_{c} \cdot (1 + \min (s, t)) \cdot Σ, & s \neq t , \end{cases}

(6)

where

p_{c} = E (q_{t}) = P \{{‖ε_{t - 1}‖}^{2} > c\}

.

Proof.

First, note that the indicator q_{t} depends only on ε_{t - 1} and is therefore independent of ε_{t}. Consequently,

(q_{t} ε_{t}^{(i)}),

i = 1,2

are uncorrelated random variables (RVs), with:

E (q_{t} ε_{t}^{(i)}) = E (q_{t}) E (ε_{t}^{(i)}) = 0, \quad \mathrm{Var} (q_{t} ε_{t}^{(i)}) = E (q_{t}^{2} {(ε_{t}^{(i)})}^{2}) = E (q_{t}) \mathrm{Var} (ε_{t}^{(i)}) = p_{c} σ_{i}^{2} .

Thus, according to Equality (3), it immediately follows that

E (μ_{t}^{(i)}) = E (μ_{0}^{(i)}) + \sum_{j = 0}^{t - 1} E (q_{j} ε_{j}^{(i)}) = m_{i}, i = 1,2,

while for the variance of the series

(μ_{t}^{(i)})

one obtains:

\mathrm{Var} (μ_{t}^{(i)}) = \mathrm{Var} (μ_{0}^{(i)}) + \sum_{j = 0}^{t - 1} \mathrm{Var} (q_{j} ε_{j}^{(i)}) = p_{c} t σ_{i}^{2}, \quad t \geq 0 .

(7)

Similarly, the cross-covariance between

(μ_{t}^{(1)})

and

(μ_{t}^{(2)})

can be calculated as follows

\mathrm{Cov} (μ_{t}^{(1)}, μ_{t}^{(2)}) = E [(m_{1} + \sum_{j = 0}^{t - 1} q_{j} ε_{j}^{(1)}) (m_{2} + \sum_{k = 0}^{t - 1} q_{k} ε_{k}^{(2)})] - m_{1} m_{2} = \sum_{j = 0}^{t - 1} E (q_{j}^{2} ε_{j}^{(1)} ε_{j}^{(2)}) = p_{c} t ρ σ_{1} σ_{2},

(8)

so Equations (7) and (8) confirm the first one in Equation (5). The covariance of the vectors

μ_{s}, μ_{t}

when

s \neq t

can be obtained in a similar way. First, using Equations (3) and (7), for each

t > s \geq 0

one obtains

\mathrm{Cov} (μ_{s}^{(i)}, μ_{t}^{(i)}) = E (μ_{s}^{(i)} μ_{t}^{(i)}) - m_{i}^{2} = E [μ_{s}^{(i)} (μ_{s}^{(i)} + \sum_{k = s}^{t - 1} q_{k} ε_{k}^{(i)})] - m_{i}^{2} = \mathrm{Var} (μ_{s}^{(i)}) = p_{c} s σ_{i}^{2}, \quad i = 1,2,

and when

i \neq j,

it follows:

\mathrm{Cov} (μ_{s}^{(i)}, μ_{t}^{(j)}) = E (μ_{s}^{(i)} μ_{t}^{(j)}) - m_{1} m_{2} = E [μ_{s}^{(i)} (μ_{s}^{(j)} + \sum_{k = s}^{t - 1} q_{k} ε_{k}^{(j)})] - m_{1} m_{2} = \mathrm{Cov} (μ_{s}^{(i)}, μ_{s}^{(j)}) = p_{c} s ρ σ_{1} σ_{2} .

Obviously, the last two equalities confirm the second one in Equation (5).

The appropriate equalities for the basic GSB-series

(Y_{t}^{(i)}), i = 1,2,

can be obtained in a similar way. Namely, according to Equation (1) one obtains

E (Y_{t}^{(i)}) = E (μ_{t}^{(i)}) + E (ε_{t}^{(i)}) = m_{i}, i = 1,2,

while for the variance of

(Y_{t}^{(i)})

we get

\mathrm{Var} (Y_{t}^{(i)}) = \mathrm{Var} (μ_{t}^{(i)}) + \mathrm{Var} (ε_{t}^{(i)}) = (1 + p_{c} t) σ_{i}^{2}, \quad t \geq 0 .

(9)

In addition, the cross-covariance between

(Y_{t}^{(1)})

and

(Y_{t}^{(2)})

reads as

\mathrm{Cov} (Y_{t}^{(1)}, Y_{t}^{(2)}) = \mathrm{Cov} (μ_{t}^{(1)}, μ_{t}^{(2)}) + \mathrm{Cov} (ε_{t}^{(1)}, ε_{t}^{(2)}) = (1 + p_{c} t) ρ σ_{1} σ_{2},

(10)

thus confirming the first part of Equation (6). Finally, using the previous results and Equation (1), for the vectors

Y_{s}, Y_{t}

and any

t > s \geq 0

one obtains:

\mathrm{Cov} (Y_{s}^{(i)}, Y_{t}^{(i)}) = E (Y_{s}^{(i)} Y_{t}^{(i)}) - m_{i}^{2} = E (μ_{s}^{(i)} μ_{t}^{(i)}) + E (ε_{s}^{(i)} μ_{t}^{(i)}) - m_{i}^{2} = \mathrm{Cov} (μ_{s}^{(i)}, μ_{t}^{(i)}) + p_{c} \mathrm{Var} (ε_{s}^{(i)}) = p_{c} σ_{i}^{2} (s + 1),

\mathrm{Cov} (Y_{s}^{(i)}, Y_{t}^{(j)}) = E (Y_{s}^{(i)} Y_{t}^{(j)}) - m_{1} m_{2} = \mathrm{Cov} (μ_{s}^{(i)}, μ_{t}^{(j)}) + E (ε_{s}^{(i)} μ_{t}^{(j)}) = p_{c} ρ σ_{1} σ_{2} (s + 1), \quad i \neq j .

From here, the second equality in (6) is easily obtained. □

Remark 1.

The previous theorem provides additional insight into the stochastic structure of the 2D-GSB process. Unlike the primary GSB vector series

(Y_{t})

, which may exhibit sudden and pronounced fluctuations, the series

(μ_{t})

is

F_{t - 1}

measurable and thus represents the predictable, stable component of the process, as illustrated in the upper panels of Figure 1. In contrast, the innovations

(ε_{t})

constitute the noise component of

(Y_{t})

and represent the main source of abrupt fluctuations in the 2D-GSB framework. Moreover, according to Equations (5) and (6), the covariance matrices of both vector series

(Y_{t})

and

(μ_{t})

are time-dependent and vary with

s, t \geq 0

, indicating the non-stationarity of both of these series.

Finally, using Equations (5)–(10), the autocorrelation functions of the components of these two vector series are easily obtained as follows:

\mathrm{Corr} (μ_{s}^{(i)}, μ_{t}^{(i)}) = \frac{\mathrm{Cov} (μ_{s}^{(i)}, μ_{t}^{(i)})}{\sqrt{\mathrm{Var} (μ_{s}^{(i)})} \cdot \sqrt{\mathrm{Var} (μ_{t}^{(i)})}} = \begin{cases} \frac{\min (s, t)}{\sqrt{s \cdot t}}, & s \neq t \\ 1, & s = t , \end{cases}

\mathrm{Corr} (Y_{s}^{(i)}, Y_{t}^{(i)}) = \frac{\mathrm{Cov} (Y_{s}^{(i)}, Y_{t}^{(i)})}{\sqrt{\mathrm{Var} (Y_{s}^{(i)})} \cdot \sqrt{\mathrm{Var} (Y_{t}^{(i)})}} = \begin{cases} \frac{p_{c} (1 + \min (s, t))}{\sqrt{(1 + p_{c} s) \cdot (1 + p_{c} t)}}, & s \neq t \\ 1, & s = t . \end{cases}

It should be noted that these results are completely analogous to the results of the one-dimensional GSB process (see, e.g., Stojanović et al. [11]). Hence, the autocorrelations of the martingale mean values are L²-continuous, because

\lim_{s \to t} \mathrm{Corr} (μ_{s}^{(i)}, μ_{t}^{(i)}) = \frac{t}{\sqrt{t^{2}}} = 1,

while the autocorrelations of

(Y_{t})

clearly do not satisfy the L²-continuity condition.
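The two autocorrelation formulas above are easy to evaluate numerically. The sketch below is illustrative only: the function names are hypothetical, and `near_diagonal_limit_y` simply substitutes s = t into the off-diagonal branch of the formula for the observed series, which gives p_c (1 + t) / (1 + p_c t).

```python
import math

def corr_mu(s, t):
    """Autocorrelation of the martingale means for s, t >= 1."""
    if s == t:
        return 1.0
    return min(s, t) / math.sqrt(s * t)

def corr_y(s, t, p_c):
    """Autocorrelation of the observed series for s, t >= 0."""
    if s == t:
        return 1.0
    return p_c * (1 + min(s, t)) / math.sqrt((1 + p_c * s) * (1 + p_c * t))

def near_diagonal_limit_y(t, p_c):
    """Off-diagonal branch of corr_y evaluated formally at s = t."""
    return p_c * (1 + t) / (1 + p_c * t)
```

Since p_c (1 + t) < 1 + p_c t whenever p_c < 1, `near_diagonal_limit_y` stays strictly below the diagonal value 1 for every finite t, which is the discontinuity referred to above; the corresponding expression for the martingale means, t / \sqrt{t^{2}}, equals 1.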

Additionally, note that autocorrelations, as well as variances, are more pronounced for the series

(Y_{t})

than for

(μ_{t})

, which is a consequence of the inclusion of an additional noise term and can be seen in the lower graphs of Figure 1. Finally, using Equations (7)–(10), for the intercorrelation between the components of the above processes, one easily obtains:

\mathrm{Corr} (μ_{t}^{(1)}, μ_{t}^{(2)}) = \mathrm{Corr} (Y_{t}^{(1)}, Y_{t}^{(2)}) = ρ .

Thus, both intercorrelations are constant and equal to the intercorrelations of the components of the innovations

(ε_{t})

, regardless of time

t

. This means that the components of all observed processes are coupled in the same way, i.e., they share the intercorrelation of the components of the innovation series.

In the following, we define another component of the 2D-GSB process, the so-called increments, as the vector series

X_{t} = {(X_{t}^{(1)}, X_{t}^{(2)})}^{⊤}

which satisfies the following equalities:

X_{t}^{(i)} = Y_{t}^{(i)} - Y_{t - 1}^{(i)} = (μ_{t}^{(i)} - μ_{t - 1}^{(i)}) + (ε_{t}^{(i)} - ε_{t - 1}^{(i)}), i = 1,2, t = 1,2, \dots

(11)

According to Equations (3), (4) and (11), it is easy to see that the components of the vector series

(X_{t})

can be presented as follows:

X_{t}^{(i)} = ε_{t}^{(i)} - r_{t - 1} ε_{t - 1}^{(i)}, i = 1,2, t = 1,2, \dots,

(12)

where

r_{t} = 1 - q_{t} = I ({‖ε_{t - 1}‖}^{2} \leq c)

. Thus, the vector series

(X_{t})

is a non-linear stochastic process with random coefficient

r_{t}

, which is “close” to the standard moving average (MA) processes (of order 1). More precisely, the series

(X_{t})

operates in two regimes:

(a): A pronounced fluctuation of the innovations, ${‖ε_{t - 2}‖}^{2} > c$ , gives $q_{t - 1} = 1$ and hence $r_{t - 1} = 0$ (see Equation (4)), so Equation (12) becomes $X_{t}^{(i)} = ε_{t}^{(i)}$ .
(b): A fluctuation that does not exceed the critical value, ${‖ε_{t - 2}‖}^{2} \leq c$ , gives $r_{t - 1} = 1$ . Then, $X_{t}^{(i)}$ takes the form of a standard, linear MA(1) process, i.e., $X_{t}^{(i)} = ε_{t}^{(i)} - ε_{t - 1}^{(i)}$ .

According to this, the vector series

(X_{t})

has similar properties to the vector MA(1) models, and its stochastic properties (mean, covariance matrix, and autocorrelation function) can be obtained as given below.

Theorem 2.

Let

(X_{t})

be the vector series defined by Equations (11) and (12). Then,

(X_{t})

is the zero-mean vector series with the covariance matrix:

Γ_{X} (h) = \mathrm{Cov} (X_{t}, X_{t + h}) = \begin{cases} (2 - p_{c}) \cdot Σ, & h = 0 \\ (p_{c} - 1) \cdot Σ, & h = \pm 1 \\ 0_{2 \times 2}, & |h| \geq 2 . \end{cases}

(13)

In addition, the autocorrelation of the components

X_{t}^{(i)},

i = 1,2,

reads as follows:

ρ_{X} (h) ≔ \frac{\mathrm{Cov} (X_{t}^{(i)}, X_{t + h}^{(i)})}{\mathrm{Var} (X_{t}^{(i)})} = \begin{cases} 1, & h = 0 \\ (p_{c} - 1) / (2 - p_{c}), & h = \pm 1 \\ 0, & |h| \geq 2 . \end{cases}

(14)

Proof.

According to the definition of the vector series

(X_{t}),

given by Equations (11) and (12), it immediately follows:

E (X_{t}) = E (ε_{t}) - E (r_{t - 1} ε_{t - 1}) = 0_{2 \times 1} .

Further, similarly to the proof of the previous theorem, Equation (13) is obtained from the equalities:

\mathrm{Cov} (X_{t}^{(i)}, X_{t + h}^{(i)}) = E (X_{t}^{(i)} X_{t + h}^{(i)}) = E [(ε_{t}^{(i)} - r_{t - 1} ε_{t - 1}^{(i)}) (ε_{t + h}^{(i)} - r_{t + h - 1} ε_{t + h - 1}^{(i)})] = \begin{cases} (2 - p_{c}) σ_{i}^{2}, & h = 0 \\ (p_{c} - 1) σ_{i}^{2}, & h = \pm 1 \\ 0, & |h| \geq 2 \end{cases},

(15)

\mathrm{Cov} (X_{t}^{(i)}, X_{t + h}^{(j)}) = E (X_{t}^{(i)} X_{t + h}^{(j)}) = E [(ε_{t}^{(i)} - r_{t - 1} ε_{t - 1}^{(i)}) (ε_{t + h}^{(j)} - r_{t + h - 1} ε_{t + h - 1}^{(j)})] = \begin{cases} (2 - p_{c}) ρ σ_{1} σ_{2}, & h = 0 \\ (p_{c} - 1) ρ σ_{1} σ_{2}, & h = \pm 1 \\ 0, & |h| \geq 2 \end{cases},

where

i, j = 1,2

and

i \neq j .

Finally, Equations (13) and (15) directly imply Equation (14). □

Remark 2.

According to the previous theorem, it is obvious that the vector series

(X_{t})

is stationary, with a structure similar to the standard MA(1) vector series. In addition, from Equations (11) and (12), it follows that

Y_{t}^{(i)} - Y_{t - 1}^{(i)} = ε_{t}^{(i)} - r_{t - 1} ε_{t - 1}^{(i)}, i = 1,2, t = 1,2, \dots

which is a non-linear integrated auto-regressive moving average (ARIMA) model with “temporary” components

(r_{t - 1} ε_{t - 1}^{(i)})

. It implies a specific structure and distributional properties of the (stationary) series

(X_{t})

, as well as other (non-stationary) components of the 2D-GSB process, which is discussed in more detail below.
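The covariance structure in Equation (13) can be checked empirically on a long simulated path of increments. The sketch below is a hedged illustration, not the authors' code: the helper names `simulate_increments` and `sample_autocov` are hypothetical, and the start-up condition q_0 = 1 is deliberately ignored so that the series is sampled directly in its stationary regime.

```python
import numpy as np

def simulate_increments(n, Sigma, c, seed=0):
    """Draw n increments X_t = eps_t - r_{t-1} * eps_{t-1} (Equation (12)).

    Assumption: the initial condition q_0 = 1 is dropped.
    Returns the increments and the empirical frequency of q = 1.
    """
    rng = np.random.default_rng(seed)
    eps = rng.multivariate_normal(np.zeros(2), Sigma, size=n + 2)
    # r_{t-1} = 1 when ||eps_{t-2}||^2 <= c, following Equation (4).
    r = (np.sum(eps[:-2] ** 2, axis=1) <= c).astype(float)
    X = eps[2:] - r[:, None] * eps[1:-1]
    return X, 1.0 - r.mean()

def sample_autocov(X, h):
    """Sample estimate of E(X_t X_{t+h}^T) for a zero-mean series, h >= 0."""
    n = len(X)
    return X[: n - h].T @ X[h:] / (n - h)
```

With, say, `X, p_hat = simulate_increments(200_000, Sigma, c=2.0)`, the matrices `sample_autocov(X, 0)`, `sample_autocov(X, 1)` and `sample_autocov(X, 2)` can be compared with `(2 - p_hat) * Sigma`, `(p_hat - 1) * Sigma` and the zero matrix, respectively.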

3. Distributional Properties

In this section, we examine selected stochastic properties of the main components of the 2D-GSB process, focusing on their distributional and asymptotic behavior. To this end, as mentioned earlier, the vector series of increments

(X_{t})

plays a central role, primarily due to its stationarity. The key distributional properties of this series are summarized in the following result.

Theorem 3.

Let

q_{t} (c)

be the noise indicator given by Equation (4), and

(X_{t})

the bivariate time series, defined by Equations (11) and (12). Then, for any

x = {(x_{1}, x_{2})}^{⊤} \in R^{2}

and

t = 1,2, \dots,

the cumulative distribution function (CDF) of the bivariate RVs

(X_{t})

is given by

F_{X} (x) ≔ P \{X_{t} < x\} = p_{c} F_{1} (x) + (1 - p_{c}) F_{2} (x),

(16)

where

F_{1} (x)

and

F_{2} (x)

are CDFs of the bivariate Gaussian RVs

ε_{t} ~ N (0, Σ)

and

\sqrt{2} ε_{t} ~ N (0, 2 Σ),

respectively. In addition, for the series

({‖ε_{t}‖}^{2})

the following equality (in the distribution) holds

{‖ε_{t}‖}^{2} \overset{d}{=} λ_{1} Z_{1} + λ_{2} Z_{2},

(17)

where

λ_{1}, λ_{2} > 0

are the eigenvalues of the matrix

Σ

, and

Z_{1}, Z_{2}

are independent

χ_{1}^{2}

distributed RVs.

Proof.

Using the conditional probabilities and Equation (12), for the CDF of RVs

(X_{t})

one obtains

F_{X} (x) ≔ P \{X_{t} < x\} = P \{X_{t} < x| r_{t - 1} = 0\} \cdot P \{r_{t - 1} = 0\} + P \{X_{t} < x| r_{t - 1} = 1\} \cdot P \{r_{t - 1} = 1\} = P \{ε_{t} < x\} \cdot p_{c} + P \{ε_{t} - ε_{t - 1} < x\} \cdot (1 - p_{c}),

which immediately implies Equation (16). Furthermore, the eigenvalues

λ_{1}, λ_{2}

of the matrix

Σ \in R^{2 \times 2}

are solutions (in terms of

λ)

of the equation

\det (Σ - λ I) = 0

, i.e.,

λ^{2} - (σ_{1}^{2} + σ_{2}^{2}) λ + (1 - ρ^{2}) σ_{1}^{2} σ_{2}^{2} = 0 .

According to the Vieta formulas, it follows that

λ_{1} + λ_{2} = σ_{1}^{2} + σ_{2}^{2} > 0, λ_{1} λ_{2} = (1 - ρ^{2}) σ_{1}^{2} σ_{2}^{2} > 0,

that is,

λ_{1}, λ_{2} > 0 .

Hence, according to the spectral theorem (see, e.g., [18], Section 29.2), the symmetric positive definite matrix

Σ \in R^{2 \times 2}

has the spectral decomposition

Σ = P Λ P^{⊤} .

Herein,

P = ({\vec{v}}_{1}, {\vec{v}}_{2}) \in R^{2 \times 2}

is an orthogonal matrix (

P P^{⊤} = P^{⊤} P = I)

of orthonormal eigenvectors

{\vec{v}}_{1}, {\vec{v}}_{2}

of the matrix

Σ

, and

Λ = \mathrm{diag} (λ_{1}, λ_{2})

is a diagonal matrix of the eigenvalues

λ_{1}, λ_{2}

. Now, let us define a new vector series

η_{t} = P^{⊤} ε_{t} = {({\vec{v}}_{1}^{⊤} ε_{t}, {\vec{v}}_{2}^{⊤} ε_{t})}^{⊤},

which satisfies the equalities:

E (η_{t}) = P^{⊤} E (ε_{t}) = 0_{2 \times 1}, \quad \mathrm{Cov} (η_{t}) = E (η_{t} η_{t}^{⊤}) = E (P^{⊤} ε_{t} ε_{t}^{⊤} P) = P^{⊤} Σ P = P^{⊤} (P Λ P^{⊤}) P = Λ .

Thus,

η_{t} = (η_{t}^{(1)}, η_{t}^{(2)})

is a bivariate Gaussian process, with

η_{t}^{(i)} ~ N (0, λ_{i}),

i = 1,2,

and

η_{t}^{(1)} ⊥ η_{t}^{(2)} .

Finally, Equation (17) follows from

ε_{t} = P η_{t}

and the equalities:

{‖ε_{t}‖}^{2} = ε_{t}^{⊤} ε_{t} = η_{t}^{⊤} P^{⊤} P η_{t} = η_{t}^{⊤} η_{t} = {‖η_{t}‖}^{2} = λ_{1} Z_{1} + λ_{2} Z_{2}, \quad Z_{i} = {(η_{t}^{(i)})}^{2} / λ_{i} \sim χ_{1}^{2}, \ i = 1,2 .

□
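Equation (17) gives a practical way to evaluate the activation probability p_c = P{‖ε_t‖² > c}. The following sketch is illustrative only: the function names are hypothetical, and the Monte Carlo estimator is one simple option among several. In the isotropic special case ρ = 0, σ_1 = σ_2 = σ (an assumption used here only as a sanity check), ‖ε_t‖² / σ² is χ²-distributed with two degrees of freedom, so that p_c = exp(−c / (2σ²)).

```python
import numpy as np

def covariance_eigenvalues(sigma1, sigma2, rho):
    """Eigenvalues lambda_1, lambda_2 of the innovation covariance matrix."""
    Sigma = np.array([[sigma1**2, rho * sigma1 * sigma2],
                      [rho * sigma1 * sigma2, sigma2**2]])
    return np.linalg.eigvalsh(Sigma)  # ascending order, symmetric input

def p_c_monte_carlo(lam, c, n=200_000, seed=0):
    """Estimate P{lambda_1 Z_1 + lambda_2 Z_2 > c} with Z_i ~ chi-square(1)."""
    rng = np.random.default_rng(seed)
    Z = rng.chisquare(df=1, size=(n, 2))  # independent chi-square(1) draws
    return float(np.mean(Z @ lam > c))
```

The eigenvalues returned by `covariance_eigenvalues` satisfy the Vieta relations used in the proof, which offers a quick consistency check before estimating p_c.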

Remark 3.

Similar to the one-dimensional case of the GSB process (see, e.g., Jovanović et al. [12]), by differentiating Equation (16), the bivariate probability density function (PDF) of the vector series

(X_{t})

can be obtained as follows:

f_{X} (x) = p_{c} f_{1} (x) + (1 - p_{c}) f_{2} (x) .

Here,

f_{k} (x) = \frac{1}{2 π \sqrt{\det (k Σ)}} \exp (- \frac{1}{2} x^{⊤} {(k Σ)}^{- 1} x), \quad k = 1,2

are the bivariate PDFs of the Gaussian distributions

N (0_{2 \times 1}, k Σ)

, where

k = 1,2 .

Thus, the distribution of increments

(X_{t})

is a convex linear combination (i.e., mixture) of two known 2D normal distributions with zero mean and different variance–covariance matrices. In the same way, using standard procedures, for the bivariate CF of the vectors

(X_{t})

one obtains

φ_{X} (u) = E (\exp i 〈u, X_{t}〉) = p_{c} φ_{ε} (u) + (1 - p_{c}) φ_{\sqrt{2} ε} (u) = p_{c} \exp (- \frac{1}{2} u^{⊤} Σ u) + (1 - p_{c}) \exp (- u^{⊤} Σ u),

(18)

where

u = {(u_{1}, u_{2})}^{⊤} \in R^{2}

and

φ_{ε} (u) = \exp (- u^{⊤} Σ u / 2)

is the CF of the bivariate Gaussian RVs

(ε_{t}) ~ N (0, Σ) .

On the other hand, Equation (17) indicates that RVs

ζ_{t} = {‖ε_{t}‖}^{2}

have the so-called weighted (i.e., mixture)

χ_{2}^{2}

distribution. In general, the distribution of

(ζ_{t})

differs from the “ordinary” chi-square distribution (see Figure 2) and does not have a closed form for its PDF and CDF. Nevertheless, it is easy to see that the CF of this distribution is of the form

φ_{ζ} (u) = E (\exp (i u (λ_{1} Z_{1} + λ_{2} Z_{2}))) = φ_{Z_{1}} (λ_{1} u) φ_{Z_{2}} (λ_{2} u) = {((1 - 2 i λ_{1} u) (1 - 2 i λ_{2} u))}^{- 1 / 2},

where

u \in R .

This fact, along with Equation (18), can be useful in estimating the parameters of the 2D-GSB model, as discussed below.
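As a rough illustration of how Equation (18) can be confronted with data, the sketch below compares the theoretical CF of the increments with an empirical CF computed from a simulated path. All function names are hypothetical, the start-up condition q_0 = 1 is ignored, and the empirical frequency `p_hat` stands in for p_c; none of this is taken from the estimation procedure of the paper.

```python
import numpy as np

def cf_increments(u, Sigma, p_c):
    """Theoretical CF of X_t from Equation (18): two-component Gaussian mixture."""
    quad = float(u @ Sigma @ u)
    return p_c * np.exp(-0.5 * quad) + (1.0 - p_c) * np.exp(-quad)

def empirical_cf(u, X):
    """Empirical CF: sample mean of exp(i <u, X_t>)."""
    return np.mean(np.exp(1j * (X @ u)))

def simulate_increments(n, Sigma, c, seed=0):
    """Increments X_t = eps_t - r_{t-1} eps_{t-1}; stationary start assumed."""
    rng = np.random.default_rng(seed)
    eps = rng.multivariate_normal(np.zeros(2), Sigma, size=n + 2)
    r = (np.sum(eps[:-2] ** 2, axis=1) <= c).astype(float)
    return eps[2:] - r[:, None] * eps[1:-1], 1.0 - r.mean()
```

For a fixed argument such as `u = np.array([0.7, -0.4])`, the modulus of `empirical_cf(u, X) - cf_increments(u, Sigma, p_hat)` should be small for long paths; an ECF-type estimator would minimize a weighted version of this discrepancy over a set of arguments u.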

The distribution of bivariate series

(μ_{t})

and

(Y_{t}),

as non-stationary components of the GSB process, can be described as follows.

Theorem 4.

Let

(Y_{t})

and

(μ_{t})

be the bivariate time series defined, respectively, by Equations (1) and (3), where

μ_{0} \overset{a.s.}{=} {(m_{1}, m_{2})}^{⊤} \ (\mathrm{const.}) .

Then, for any

x = {(x_{1}, x_{2})}^{⊤} \in R^{2}

and

t = 0,1, 2, \dots

, the CDFs of these series are

F_{μ} (x, t) ≔ P \{μ_{t} < x\} = \sum_{k = 0}^{t} π_{k}^{(t)} (c) F_{k} (x), F_{Y} (x, t) ≔ P \{Y_{t} < x\} = \sum_{k = 0}^{t} π_{k}^{(t)} (c) F_{k + 1} (x),

(19)

where

π_{k}^{(t)} (c) ≔ \binom{t}{k} p_{c}^{k} {(1 - p_{c})}^{t - k}

and

F_{k} (x),

k = 0, \dots, t + 1

are the CDFs of the bivariate Gaussian RVs

N (μ_{0}, k Σ)

. Furthermore, the following convergences (in distribution) hold:

t^{- 1 / 2} μ_{t} \overset{d}{⟶} N (0, p_{c} Σ), t^{- 1 / 2} Y_{t} \overset{d}{⟶} N (0, p_{c} Σ), t \to + \infty .

(20)

Proof.

Let us define the bivariate RVs

ξ_{t} = q_{t} ε_{t}

, where

t = 0, 1, \dots

It can be easily proven that

(ξ_{t})

is a series of mutually uncorrelated RVs, with

E (ξ_{t}) = 0_{2 \times 1}

and

\mathrm{Cov} (ξ_{t}) = p_{c} Σ

. By reapplying conditional probabilities, the CDF of RVs

(ξ_{t})

is obtained as

F_{ξ} (x) ≔ P \{ξ_{t} < x\} = P \{ξ_{t} < x| q_{t} = 1\} \cdot P \{q_{t} = 1\} + P \{ξ_{t} < x| q_{t} = 0\} \cdot P \{q_{t} = 0\} = P \{ε_{t} < x\} \cdot p_{c} + P \{0 < x\} \cdot (1 - p_{c}) = p_{c} F_{ε} (x) + (1 - p_{c}) F_{δ} (x),

where

F_{δ} (x) = I (x > 0)

is the CDF of the bivariate RV

δ_{0} \overset{a s}{=} 0 .

According to this, for the CF of the RVs

(ξ_{t})

one obtains

φ_{ξ} (u) = E (\exp i 〈u, ξ_{t}〉) = p_{c} φ_{ε} (u) + (1 - p_{c}) φ_{δ} (u) = p_{c} \exp (- \frac{1}{2} u^{⊤} Σ u) + (1 - p_{c}),

where

φ_{δ} (u) \equiv 1

is the CF of the RV

δ_{0}

. Now, by applying Equation (3), for the CFs of the martingale means

(μ_{t})

we get

φ_{μ} (u, t) = E (\exp i 〈u, μ_{t}〉) = \exp (i 〈u, μ_{0}〉) E (\prod_{s = 0}^{t - 1} \exp (i q_{s} 〈u, ε_{s}〉)) = \exp (i u^{⊤} μ_{0}) {(p_{c} \exp (- \frac{1}{2} u^{⊤} Σ u) + (1 - p_{c}))}^{t} = \exp (i u^{⊤} μ_{0}) \sum_{k = 0}^{t} \binom{t}{k} p_{c}^{k} {(1 - p_{c})}^{t - k} \exp (- \frac{1}{2} k u^{⊤} Σ u),

(21)

where the binomial formula is used in the last equality. From here, by applying Lévy’s correspondence theorem (see, e.g., [19], Section 14.2), the first part of Equation (19) immediately follows. Similarly, using Equations (1) and (21), for the CFs of series

(Y_{t})

one obtains:

φ_{Y} (u, t) = E (\exp i 〈u, Y_{t}〉) = φ_{μ} (u, t) \cdot φ_{ε} (u) = \exp (i u^{⊤} μ_{0}) {(p_{c} \exp (- \frac{1}{2} u^{⊤} Σ u) + (1 - p_{c}))}^{t} E (\exp (i 〈u, ε_{t}〉)) = \exp (i u^{⊤} μ_{0} - \frac{1}{2} u^{⊤} Σ u) \sum_{k = 0}^{t} \binom{t}{k} p_{c}^{k} {(1 - p_{c})}^{t - k} \exp (- \frac{1}{2} k u^{⊤} Σ u) = \exp (i u^{⊤} μ_{0}) \sum_{k = 0}^{t} \binom{t}{k} p_{c}^{k} {(1 - p_{c})}^{t - k} \exp (- \frac{1}{2} (k + 1) u^{⊤} Σ u) .

(22)

Thus, by reapplying Lévy’s theorem, we get the second equation in (19).

To prove the second part of the theorem, i.e., Equation (20), note that the CFs of the bivariate RVs

t^{- 1 / 2} μ_{t}

and

t^{- 1 / 2} Y_{t}

,

t = 1, 2, \dots,

according to previous Equations (21) and (22), can be written as follows:

φ_{μ} (t^{- 1 / 2} u, t) = \exp (\frac{i u^{⊤} μ_{0}}{\sqrt{t}}) {(1 + p_{c} (\exp (- \frac{1}{2 t} u^{⊤} Σ u) - 1))}^{t}, φ_{Y} (t^{- 1 / 2} u, t) = \exp (\frac{i u^{⊤} μ_{0}}{\sqrt{t}}) {(1 + p_{c} (\exp (- \frac{1}{2 t} u^{⊤} Σ u) - 1))}^{t} \exp (- \frac{1}{2 t} u^{⊤} Σ u) .

After taking the logarithm and expanding the exponential terms, when

t \to + \infty

, we get

\log φ_{μ} (t^{- 1 / 2} u, t) = \frac{i u^{⊤} μ_{0}}{\sqrt{t}} + t \log (1 + p_{c} (\exp (- \frac{1}{2 t} u^{⊤} Σ u) - 1)) = \frac{i u^{⊤} μ_{0}}{\sqrt{t}} - \frac{p_{c}}{2} u^{⊤} Σ u + O (t^{- 1}),

\log φ_{Y} (t^{- 1 / 2} u, t) = \log φ_{μ} (t^{- 1 / 2} u, t) - \frac{u^{⊤} Σ u}{2 t} = \frac{i u^{⊤} μ_{0}}{\sqrt{t}} - \frac{p_{c}}{2} u^{⊤} Σ u + O (t^{- 1}),

whence it follows:

\lim_{t \to + \infty} φ_{μ} (t^{- 1 / 2} u, t) = \lim_{t \to + \infty} φ_{Y} (t^{- 1 / 2} u, t) = \exp (- \frac{p_{c}}{2} u^{⊤} Σ u) .

Obviously, the limit thus obtained is the CF of the bivariate Gaussian distribution

N (0, p_{c} Σ)

, which confirms both convergences in Equation (20). □
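The limit used in this proof can be inspected numerically. The sketch below is a small illustration with hypothetical function names; it assumes μ_0 = 0 (so that the phase factor drops out) and works with the scalar a = u^⊤ Σ u, evaluating the logarithms of the scaled CFs for increasing t.

```python
import math

def log_cf_scaled_mu(a, p_c, t):
    """log of the CF of t^{-1/2} mu_t at u, with a = u^T Sigma u and mu_0 = 0."""
    # expm1 and log1p keep the evaluation accurate when a / (2 t) is tiny.
    return t * math.log1p(p_c * math.expm1(-a / (2.0 * t)))

def log_cf_scaled_y(a, p_c, t):
    """Same quantity for t^{-1/2} Y_t: one extra Gaussian factor."""
    return log_cf_scaled_mu(a, p_c, t) - a / (2.0 * t)

def gaussian_limit(a, p_c):
    """log of the CF of N(0, p_c * Sigma) at u."""
    return -0.5 * p_c * a
```

Evaluating `log_cf_scaled_mu(a, p_c, t) - gaussian_limit(a, p_c)` for t = 10, 100, 1000, ... shows the gap shrinking toward zero, in line with the convergences in Equation (20).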

Remark 4.

Note that, based on the previous results, the CFs of both processes

(μ_{t})

and

(Y_{t})

are convex combinations (mixtures) of Gaussian CFs with binomial weights. This is because the noise indicators

q_{t} ~ Bernoulli (p_{c})

,

t = 1,2, \dots

, are mutually independent Bernoulli RVs, which implies:

K_{t} ≔ \sum_{s = 1}^{t} q_{s} ~ Binomial (t, p_{c}) .

Thus, according to Equation (19), it follows that, conditionally on

K_{t} = k

, both processes

(μ_{t})

and

(Y_{t})

are Gaussian:

μ_{t} |(K_{t} = k) ~ N (μ_{0}, k Σ), Y_{t} |(K_{t} = k) ~ N (μ_{0}, (k + 1) Σ) .

Consequently, their marginals are finite Gaussian mixtures with binomial weights:

μ_{t} ~ \sum_{k = 0}^{t} (\binom{t}{k}) p_{c}^{k} {(1 - p_{c})}^{t - k} N (μ_{0}, k Σ), Y_{t} ~ \sum_{k = 0}^{t} (\binom{t}{k}) p_{c}^{k} {(1 - p_{c})}^{t - k} N (μ_{0}, (k + 1) Σ) .

This representation explicitly reveals the latent binomial–Gaussian mixture structure underlying the non-stationary processes

(μ_{t})

and

(Y_{t})

. Moreover, Equation (20) shows that even these non-stationary bivariate series

(μ_{t})

and

(Y_{t})

generate asymptotically normal scaled processes

(t^{- 1 / 2} μ_{t})

and

(t^{- 1 / 2} Y_{t})

as

t \to + \infty

. This behavior reflects a specific form of random thinning of bivariate Gaussian noise

(ε_{t})

, where each innovation is incorporated into the cumulative dynamics with probability

p_{c}

. In the limit, the resulting distribution remains Gaussian, but with a reduced covariance matrix

p_{c} Σ

. This property is particularly relevant for practical application of the 2D-GSB process and is illustrated in Figure 3, which shows the convergence of the moduli of the CFs of

(μ_{t})

and

(Y_{t})

for increasing time indices.
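The mixture representation above also gives a direct way of sampling μ_t without simulating the whole path. The sketch below is a hedged illustration with assumed parameter values and hypothetical helper names (`sample_mu_t`, `kurtosis`): it draws K_t from the binomial distribution, then μ_t given K_t = k from N(μ_0, kΣ), and reports the sample covariance of t^{-1/2} μ_t together with the kurtosis of its first component.

```python
import numpy as np

def sample_mu_t(t, mu0, Sigma, p_c, size, rng):
    """Draw `size` copies of mu_t from the binomial-Gaussian mixture."""
    k = rng.binomial(t, p_c, size=size)                          # K_t ~ Binomial(t, p_c)
    z = rng.multivariate_normal(np.zeros(2), Sigma, size=size)   # z ~ N(0, Sigma)
    # Given K_t = k, mu0 + sqrt(k) * z is N(mu0, k * Sigma)
    return mu0 + np.sqrt(k)[:, None] * z

def kurtosis(x):
    """Plain (non-excess) sample kurtosis; close to 3 for a Gaussian sample."""
    c = x - x.mean()
    return np.mean(c**4) / np.mean(c**2) ** 2

rng = np.random.default_rng(0)
mu0 = np.array([1.0, -0.5])                   # assumed values, for illustration only
Sigma = np.array([[1.0, 0.5], [0.5, 2.0]])
p_c = 0.3

for t in (5, 500):
    scaled = sample_mu_t(t, mu0, Sigma, p_c, 200_000, rng) / np.sqrt(t)
    print(t, np.cov(scaled.T).round(3).tolist(), round(kurtosis(scaled[:, 0]), 3))
```

Under these assumptions the covariance of the scaled variable equals p_c Σ for every t, while the kurtosis moves toward the Gaussian value 3 only as t grows, which is the sense in which the normal limit is asymptotic.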

Similar to the one-dimensional case, some additional asymptotic properties of the non-stationary bivariate series

(μ_{t})

and

(Y_{t})

can be described in terms of their (linear) transformations. This refers to the so-called scaled averages, for which convergence in distribution and asymptotically normal (AN) limits can be established. In that sense, the following statement, named the central limit theorem (CLT) of the 2D-GSB process, can be formulated:

Theorem 5.

For arbitrary

α \geq 1,

let us define the so-called

α

-mean series

M_{t; α} = \frac{1}{t^{α}} \sum_{s = 1}^{t} μ_{s}, U_{t; α} = \frac{1}{t^{α}} \sum_{s = 1}^{t} Y_{s},

where

(Y_{t})

and

(μ_{t})

are the non-stationary bivariate time series defined by Equations (1) and (3), respectively. Then, the following statements hold:

(i): When $1 \leq α \leq 3 / 2,$ the time series $M_{t; α}$ and $U_{t; α}$ are asymptotically normally distributed, i.e., the following relations hold when $t \to + \infty$ :

$M_{t; α} ~ N (μ_{0} t^{1 - α}, \frac{p_{c}}{3} t^{3 - 2 α} Σ), U_{t; α} ~ N (μ_{0} t^{1 - α}, (t^{1 - 2 α} + \frac{p_{c}}{3} t^{3 - 2 α}) Σ) .$

(23)
(ii): When $α > 3 / 2,$ the time series $M_{t; α}$ and $U_{t; α}$ asymptotically vanish, i.e.,

$M_{t; α} \overset{d}{⟶} δ_{0}, U_{t; α} \overset{d}{⟶} δ_{0}, t \to + \infty .$

(24)

Proof.

First, we shall prove the statement of the theorem for the bivariate series

(M_{t; α})

. According to the definition of the series

(μ_{t})

and

(M_{t; α}),

one obtains:

M_{t; α} = \frac{1}{t^{α}} \sum_{s = 1}^{t} μ_{s} = \frac{1}{t^{α}} \sum_{s = 1}^{t} (μ_{0} + \sum_{k = 0}^{s - 1} q_{k} ε_{k}) = \frac{1}{t^{α}} (t μ_{0} + \sum_{s = 1}^{t} \sum_{k = 0}^{s - 1} ξ_{k}) = t^{1 - α} μ_{0} + \sum_{k = 1}^{t} \frac{k}{t^{α}} ξ_{t - k} .

Thus, the bivariate series

(M_{t; α})

is a sum of uncorrelated bivariate RVs

μ_{0}

and

ξ_{s} = q_{s} ε_{s}

,

s = 0, \dots, t - 1

, so the CFs of

(M_{t; α})

can be obtained as follows:

φ_{M} (u, t, α) = φ_{μ} (t^{1 - α} u, 0) \prod_{k = 1}^{t} φ_{ξ} (\frac{k}{t^{α}} u) = \exp (i t^{1 - α} u^{⊤} μ_{0}) \prod_{k = 1}^{t} [1 + p_{c} (\exp (- \frac{k^{2}}{2 t^{2 α}} u^{⊤} Σ u) - 1)] .

Taking the logarithm of the CFs

φ_{M} (u, t, α)

gives a function

ψ_{M} (u, t, α) ≔ \log φ_{M} (u, t, α) = i t^{1 - α} u^{⊤} μ_{0} + \sum_{k = 1}^{t} \log [1 + p_{c} (\exp (- \frac{k^{2}}{2 t^{2 α}} u^{⊤} Σ u) - 1)],

and similarly to the previous theorem, taking its asymptotic value when

t \to + \infty,

we get:

ψ_{M} (u, t, α) = i t^{1 - α} u^{T} μ_{0} - \frac{p_{c}}{2 t^{2 α}} \sum_{k = 1}^{t} k^{2} u^{⊤} Σ u + O (t^{3 - 2 α}) = i t^{1 - α} u^{⊤} μ_{0} - \frac{p_{c} t (t + 1) (2 t + 1)}{12 t^{2 α}} u^{⊤} Σ u + O (t^{3 - 2 α}) = i t^{1 - α} u^{⊤} μ_{0} - \frac{p_{c} t^{3 - 2 α}}{6} u^{⊤} Σ u + O (t^{3 - 2 α}) \approx \{\begin{matrix} i t^{1 - α} u^{⊤} μ_{0} - \frac{p_{c} t^{3 - 2 α}}{6} u^{⊤} Σ u, & 1 \leq α \leq 3 / 2 \\ 0, & α > 3 / 2 . \end{matrix}

By substituting the last term in the CFs

φ_{M} (u, t, α)

and applying Lévy’s correspondence theorem, the first relations in Equations (23) and (24) are easily obtained.

The proof for the series

U_{t; α}

can be carried out in an analogous way. Using Equation (1), as well as the previously proven facts, we have that

U_{t; α} = t^{- α} \sum_{j = 1}^{t} (μ_{j} + ε_{j}) = M_{t; α} + t^{- α} \sum_{j = 1}^{t} ε_{j},

where

t^{- α} \sum_{j = 1}^{t} ε_{j} ~ N (0, t^{1 - 2 α} Σ) .

From here, for the CFs of the series

(U_{t; α})

one obtains:

φ_{U} (u, t, α) = φ_{M} (u, t, α) \prod_{j = 1}^{t} φ_{ε} (t^{- α} u) = \exp (i t^{1 - α} u^{T} μ_{0}) \exp (- \frac{t^{1 - 2 α}}{2} u^{T} Σ u) \prod_{k = 1}^{t} [1 + p_{c} (\exp (- \frac{k^{2}}{2 t^{2 α}} u^{T} Σ u) - 1)] .

Using a similar procedure as in the previous part of the proof, i.e., taking the logarithm of the function

φ_{U} (u, t, α),

we get

ψ_{U} (u, t, α) ≔ \log φ_{U} (u, t, α) = i t^{1 - α} u^{T} μ_{0} - \frac{t^{1 - 2 α}}{2} u^{T} Σ u + \sum_{k = 1}^{t} \log [1 + p_{c} (\exp (- \frac{k^{2}}{2 t^{2 α}} u^{T} Σ u) - 1)],

and by taking asymptotic value when

t \to + \infty

, it follows:

ψ_{U} (u, t, α) \approx \{\begin{matrix} i t^{1 - α} u^{T} μ_{0} - (\frac{t^{1 - 2 α}}{2} + \frac{p_{c} t^{3 - 2 α}}{6}) u^{T} Σ u, & 1 \leq α \leq 3 / 2 \\ 0, & α > 3 / 2 . \end{matrix}

Therefore, by substituting this expression into the CFs

φ_{U} (u, t, α)

, the second relations in Equations (23) and (24) are obtained, which fully proves the theorem. □
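The limiting behavior of the logarithmic sum that appears in ψ_M and ψ_U can be inspected numerically. The following sketch is an illustration under assumed values: `s` stands for the quadratic form u^⊤ Σ u, and the function name `psi_sum` is hypothetical. It evaluates the sum for the boundary case α = 3/2, where the value should approach -p_c s / 6, and for α = 2, where it should vanish.

```python
import numpy as np

def psi_sum(s, t, alpha, p_c):
    """sum_{k=1}^t log[1 + p_c (exp(-k^2 s / (2 t^(2 alpha))) - 1)], with s = u' Sigma u."""
    k = np.arange(1, t + 1, dtype=float)
    arg = -(k**2) * s / (2.0 * float(t) ** (2.0 * alpha))
    # log1p / expm1 keep precision when the exponent is close to zero
    return np.sum(np.log1p(p_c * np.expm1(arg)))

p_c, s = 0.3, 0.8            # assumed values, for illustration only
for t in (50, 500, 5000):
    print(t, psi_sum(s, t, 1.5, p_c), -p_c * s / 6.0, psi_sum(s, t, 2.0, p_c))
```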

Remark 5.

As an illustration, Figure 4 shows the moduli of the CFs for the scaled series

M_{t; α}

and

U_{t; α}

, where several representative values of

α

are chosen. It should be noted that for

0 < α < 1

, the normalization is too weak to stabilize the growth of the non-stationary components, and both scaled processes asymptotically have infinite mean and variance. Therefore, this regime does not admit a meaningful asymptotic interpretation and is not considered. The remaining cases mentioned in Theorem 5 can be interpreted as follows:

When $1 \leq α < 3 / 2$ , both series $M_{t; α}$ and $U_{t; α}$ have an approximately bivariate Gaussian distribution, with the mean $μ_{0} t^{1 - α}$ (constant for $α = 1$ and converging to zero for $α > 1$), while their variance–covariance matrices diverge.
When $α > 3 / 2$ , both series converge (in distribution) to the zero-vector $0 : = {(0,0)}^{⊤} .$ Hence, according to the well-known facts (see, e.g., Billingsley [20]), the convergence in probability also holds, i.e., $M_{t; α} \overset{P}{⟶} 0, U_{t; α} \overset{P}{⟶} 0, t \to + \infty$ .
The case $α = 3 / 2$ is of special interest, because the asymptotic relations in Equation (23) then become:

t^{- 3 / 2} \sum_{j = 1}^{t} μ_{j} \overset{d}{⟶} N (0, \frac{p_{c}}{3} Σ), t^{- 3 / 2} \sum_{j = 1}^{t} Y_{j} \overset{d}{⟶} N (0, \frac{p_{c}}{3} Σ), t \to + \infty .

(25)

Note that these convergences, compared to those given by Equation (20), have smaller variances, which is useful in the practical application of the 2D-GSB process.

4. Parameter Estimation

This section addresses the estimation of the unknown parameters of the 2D-GSB process, including the critical threshold

c

(or equivalently the regime probability

p_{c}

), the mean vector

μ_{0} = {(m_{1}, m_{2})}^{⊤}

, and the elements of the covariance matrix (

σ_{1}^{2}, σ_{2}^{2}, ρ

). To this end, note that although the original 2D-GSB components

(Y_{t})

and

(μ_{t})

are non-stationary, their asymptotic results are derived under appropriate scaling, as stated in Theorems 4 and 5. Thus, the estimation of the mean structure relies on these scaling results and on the observed series

(Y_{t})

. In contrast, the estimation of the remaining parameters is primarily based on the stationary increment process

(X_{t})

, whose stochastic properties are formally established by Theorems 2 and 3. Note that the specific mixture-based structure of

(X_{t})

requires additional considerations and a careful assessment of estimator performance. The method of moments (MoM) is applied first, followed by the empirical characteristic function (ECF) approach. Under suitable regularity conditions, the asymptotic properties of both estimators are established, and their finite-sample performance is also compared.

4.1. Moment-Based Estimators

Let

(X_{t})

,

t = 0,1, \dots, n

be some realization of the increments of the 2D-GSB process, where

X_{0} = {(0,0)}^{⊤}

. According to Theorem 2, that is, Equation (13), the covariance matrices of the series

(X_{t})

satisfy

Γ_{X} (0) = Cov (X_{t}, X_{t}) = (2 - p_{c}) \cdot Σ, Γ_{X} (1) = Cov (X_{t}, X_{t - 1}) = (p_{c} - 1) \cdot Σ,

from where it follows:

Γ_{X} (1) = \frac{p_{c} - 1}{2 - p_{c}} \cdot Γ_{X} (0) .

According to this and using the sample covariance matrices

{\hat{Γ}}_{X} (0) = \frac{1}{n} \sum_{t = 1}^{n} X_{t} X_{t}^{⊤}, {\hat{Γ}}_{X} (1) = \frac{1}{n - 1} \sum_{t = 2}^{n} X_{t} X_{t - 1}^{⊤}

the following estimator of the ratio

r = (p_{c} - 1) / (2 - p_{c})

can be calculated

\hat{r} = \frac{tr {\hat{Γ}}_{X} (1)}{tr {\hat{Γ}}_{X} (0)},

(26)

where

tr {\hat{Γ}}_{X} (0)

and

tr {\hat{Γ}}_{X} (1)

are the traces of the given matrices. Then, for the estimator of the parameter

p_{c}

one obtains:

{\hat{p}}_{c} = \frac{1 + 2 \hat{r}}{1 + \hat{r}} .

(27)

Based on the estimator

{\hat{p}}_{c}

, the corresponding estimator of the critical value

c = \hat{c}

can be determined as a solution to the equation:

P \{{‖ε_{t}‖}^{2} > c\} = {\hat{p}}_{c} .

According to Equations (14) and (26), it can be easily seen that

{\hat{p}}_{c}

and

\hat{c}

are the appropriate estimators if the following inequalities hold:

0 < {\hat{p}}_{c} < 1 ⟺ - 0.5 < \hat{r} < 0 .

Note that these conditions are equivalent to those for a one-dimensional GSB process (see, e.g., Jovanović et al. [12]). On the other hand, using the estimator

{\hat{p}}_{c}

, the covariance matrix of the 2D-GSB process can be estimated from the equation:

\hat{Σ} = \frac{1}{2 - {\hat{p}}_{c}} {\hat{Γ}}_{X} (0) .

(28)
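The moment-based procedure of Equations (26)–(28) can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation, and it rests on stated assumptions: the increments are simulated as X_t = ε_t - (1 - q_{t-1}) ε_{t-1} with indicators q_t drawn as i.i.d. Bernoulli(p_c) variables independent of the innovations (a simplification under which the covariance relations above hold exactly), the true parameter values are arbitrary, and the critical value is obtained by a Monte Carlo quantile. All function names are hypothetical.

```python
import numpy as np

def simulate_increments(n, Sigma, p_c, rng):
    """X_t = eps_t - (1 - q_{t-1}) eps_{t-1}, t = 1..n.

    Assumption of this sketch: q_t are i.i.d. Bernoulli(p_c), independent of eps.
    """
    eps = rng.multivariate_normal(np.zeros(2), Sigma, size=n + 1)   # eps_0, ..., eps_n
    q = rng.random(n + 1) < p_c                                      # q_0, ..., q_n
    return eps[1:] - (1.0 - q[:-1])[:, None] * eps[:-1]

def mom_estimates(X):
    """Moment estimators following Equations (26)-(28)."""
    n = X.shape[0]
    gamma0 = X.T @ X / n                          # (1/n) sum_t X_t X_t'
    gamma1 = X[1:].T @ X[:-1] / (n - 1)           # (1/(n-1)) sum_t X_t X_{t-1}'
    r_hat = np.trace(gamma1) / np.trace(gamma0)   # Equation (26)
    if not (-0.5 < r_hat < 0.0):
        raise ValueError("r_hat outside (-0.5, 0): p_hat would not lie in (0, 1)")
    p_hat = (1.0 + 2.0 * r_hat) / (1.0 + r_hat)   # Equation (27)
    Sigma_hat = gamma0 / (2.0 - p_hat)            # Equation (28)
    return r_hat, p_hat, Sigma_hat

def critical_value(p_hat, Sigma_hat, rng, draws=200_000):
    """Monte Carlo solution of P{||eps||^2 > c} = p_hat for eps ~ N(0, Sigma_hat)."""
    e = rng.multivariate_normal(np.zeros(2), Sigma_hat, size=draws)
    return np.quantile(np.sum(e**2, axis=1), 1.0 - p_hat)

rng = np.random.default_rng(42)
Sigma = np.array([[1.0, 0.5], [0.5, 2.0]])   # assumed true values, for illustration
p_c = 0.3
X = simulate_increments(400_000, Sigma, p_c, rng)
r_hat, p_hat, Sigma_hat = mom_estimates(X)
c_hat = critical_value(p_hat, Sigma_hat, rng)
print(r_hat, p_hat, c_hat)
print(Sigma_hat)
```

The explicit range check on `r_hat` mirrors the admissibility condition -0.5 < r̂ < 0; in short samples this condition may fail, in which case the sketch raises an error instead of returning an estimate outside (0, 1).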

For the estimators

{\hat{p}}_{c}

and

\hat{Σ}

, given by Equations (27) and (28), respectively, we say that they represent MoM estimators for the parameter

p_{c}

and covariance matrix

Σ

. At the same time, their consistency and asymptotic normality can be proven. To this end, we first prove the strong law of large numbers (SLLN) and almost sure (a.s.) consistency for the so-called moment-vectors, used in the construction of the MoM estimators.

Theorem 6.

Let

(X_{t})

be the increment series of the 2D-GSB process, defined by Equations (11) and (12). Then, for the vectors

Z_{t} ≔ vec (X_{t} X_{t}^{⊤}, X_{t} X_{t - 1}^{⊤}) \in R^{8}

the following statements hold:

(i): $E {‖Z_{t}‖}^{2} < + \infty$ , for any $t = 1,2, \dots, n$ .
(ii): The series $(Z_{t})$ is 2-dependent, i.e., each $Z_{t}$ and $Z_{s}$ are independent when $|t - s| > 2$ .
(iii): The moment-vector $V_{n} ≔ n^{- 1} \sum_{t = 1}^{n} Z_{t}$ converges almost surely to $E (Z_{1})$ , i.e.,

$\frac{1}{n} \sum_{t = 1}^{n} Z_{t} \overset{a . s .}{⟶} E (Z_{1}), n \to + \infty .$

(29)
(iv): The series $(V_{n})$ is asymptotically normal, i.e.,

$\sqrt{n} (V_{n} - E (Z_{1})) \overset{d}{⟶} N (0, Ω), n \to + \infty,$

(30)

where $Ω ≔ \sum_{h = - 2}^{2} Cov (Z_{t}, Z_{t + h}) = Var (Z_{t}) + 2 (Cov (Z_{t}, Z_{t + 1}) + Cov (Z_{t}, Z_{t + 2}))$ .

Proof.

(i) Since the components of the vector

Z_{t}

are the products of the components of the vectors

X_{t}

and

X_{t - 1}

, it is sufficient to show that the fourth moments of the increments are finite, i.e.,

E {‖X_{t}‖}^{4} < + \infty

, when

t = 1,2, \dots, n .

For this purpose, we use Jensen’s inequality (see, e.g., [21])

{(a + b)}^{p} \leq 2^{p - 1} (a^{p} + b^{p}),

which holds for any

a, b \geq 0

and

p \geq 1 .

By putting

p = 4

and applying this inequality on Equation (11), one obtains:

{‖X_{t}‖}^{4} \leq 8 ({‖ε_{t}‖}^{4} + {‖r_{t - 1} ε_{t - 1}‖}^{4}) = 8 ({‖ε_{t}‖}^{4} + r_{t - 1} {‖ε_{t - 1}‖}^{4}) .

Taking the expectation and using the inequality

r_{t - 1} \leq 1

, as well as the stationarity of the innovations

(ε_{t})

, it follows that:

E {‖X_{t}‖}^{4} \leq 8 (E {‖ε_{t}‖}^{4} + E {‖ε_{t - 1}‖}^{4}) = 16 E {‖ε_{t}‖}^{4} .

Here, the innovations

ε_{t} = {(ε_{t}^{(1)}, ε_{t}^{(2)})}^{⊤} \sim N (0, Σ)

have finite moments of all orders; in particular:

E {‖ε_{t}‖}^{4} = E ({(ε_{t}^{(1)})}^{4} + 2 {(ε_{t}^{(1)})}^{2} {(ε_{t}^{(2)})}^{2} + {(ε_{t}^{(2)})}^{4}) = 3 σ_{1}^{4} + 2 σ_{1}^{2} σ_{2}^{2} (1 + 2 ρ^{2}) + 3 σ_{2}^{4} .

Thus,

E {‖ε_{t}‖}^{4} < + \infty

holds, and therefore

E {‖X_{t}‖}^{4} < + \infty

.

(ii) By definition, the increments

(X_{t})

depend (only) on the RVs

ε_{t}

and

ε_{t - 1}

, for each

t = 1, 2, \dots .

Since

(ε_{t})

is an IID series, it follows that

(X_{t})

is a 1-dependent time series. Similarly, the components

X_{t} X_{t - 1}^{⊤}

depend on

ε_{t}, ε_{t - 1}, ε_{t - 2}

, while

X_{t} X_{t}^{⊤}

depends on

ε_{t}, ε_{t - 1}

. Therefore,

Z_{t}

depends at most on the set of RVs

{ε_{t - 2}, ε_{t - 1}, ε_{t}}

, so the vector series

(Z_{t})

is indeed 2-dependent.

(iii) Since the series

(Z_{t})

is 2-dependent, we divide the indices into the following three subsets (according to their remainder modulo 3):

A_{0} = {1, 4, 7, \dots}, A_{1} = {2, 5, 8, \dots}, A_{2} = {3, 6, 9, \dots} .

Thus, for a fixed

j \in {0, 1, 2}

, the elements of sets

S_{j} ≔ {Z_{t} | t \in A_{j}}

are independent, and according to (i), they have a common finite moment

E {‖Z_{t}‖}^{2}

. By applying Kolmogorov’s SLLN for an IID series (see, e.g., [22]), we get

\frac{1}{|A_{j} (n)|} \sum_{t \in A_{j}, t \leq n} Z_{t} \overset{a . s .}{⟶} E (Z_{1}), n \to + \infty,

where

|A_{j} (n)|

is the number of elements of the set

A_{j}

that are less than or equal to

n

. According to this, it follows that

|A_{j} (n)| / n \to 1 / 3,

when

n \to + \infty,

which implies:

\frac{1}{n} \sum_{t = 1}^{n} Z_{t} = \sum_{j = 0}^{2} \frac{|A_{j} (n)|}{n} \cdot \frac{1}{|A_{j} (n)|} \sum_{t \in A_{j}, t \leq n} Z_{t} \overset{a . s .}{⟶} \sum_{j = 0}^{2} \frac{1}{3} E (Z_{1}) = E (Z_{1}), n \to + \infty .

This proves the convergence in Equation (29).

(iv) Let

a \in R^{8}

be an arbitrary fixed vector. According to the above,

U_{t} ≔ a^{⊤} Z_{t}

is a stationary and 2-dependent series of scalar RVs, with a finite second moment:

E (U_{t}^{2}) \leq {‖a‖}^{2} E {‖Z_{t}‖}^{2} < + \infty .

Hence, by applying the Hoeffding–Robbins CLT for

m

-dependent scalar series [23], one obtains

\sqrt{n} ({\bar{U}}_{n} - E (U_{1})) \overset{d}{⟶} N (0, σ_{a}^{2}), n \to + \infty,

where

{\bar{U}}_{n} = n^{- 1} \sum_{k = 1}^{n} U_{k}

and

σ_{a}^{2} = \sum_{h = - m}^{m} Cov (U_{t}, U_{t + h}) = a^{⊤} (\sum_{h = - m}^{m} Cov (Z_{t}, Z_{t + h})) a = a^{⊤} Ω a,

with

m = 2

. Since the above is true for every

a \in R^{8}

, the Cramér–Wold device [24] implies multivariate normality as given by Equation (30). □
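The fourth-moment expression used in part (i) of the proof can be cross-checked against the standard Gaussian identity E‖ε‖⁴ = (tr Σ)² + 2 tr(Σ²). The short sketch below does this for assumed values of σ_1, σ_2 and ρ; the function names are hypothetical.

```python
import numpy as np

def fourth_moment_formula(sigma1, sigma2, rho):
    """E||eps||^4 as given in part (i) of the proof."""
    return (3.0 * sigma1**4
            + 2.0 * sigma1**2 * sigma2**2 * (1.0 + 2.0 * rho**2)
            + 3.0 * sigma2**4)

def fourth_moment_trace(Sigma):
    """Gaussian identity E||eps||^4 = (tr Sigma)^2 + 2 tr(Sigma^2) for eps ~ N(0, Sigma)."""
    return np.trace(Sigma) ** 2 + 2.0 * np.trace(Sigma @ Sigma)

sigma1, sigma2, rho = 1.0, 1.5, 0.4    # assumed values, for illustration only
Sigma = np.array([[sigma1**2, rho * sigma1 * sigma2],
                  [rho * sigma1 * sigma2, sigma2**2]])
m4 = fourth_moment_formula(sigma1, sigma2, rho)
# Third value: the upper bound 16 E||eps||^4 on E||X_t||^4 from part (i)
print(m4, fourth_moment_trace(Sigma), 16.0 * m4)
```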

As a consequence of the previous theorem, it follows:

Corollary 1.

Let

θ_{0} = (p_{0}, Σ_{0})

be the true value of the unknown parameter vector

θ = (p_{c}, Σ)

of the 2D-GSB process, where

p_{0} \in (0,1)

and

Σ_{0}

is a

2 \times 2

positive-definite matrix. Then, the estimator

{\hat{θ}}_{n} = ({\hat{p}}_{c}, \hat{Σ}),

defined by Equations (26)–(28), is strongly consistent for

θ_{0}

, i.e.,

{\hat{θ}}_{n} \overset{a . s .}{⟶} θ_{0}, n \to + \infty .

(31)

In addition,

{\hat{θ}}_{n}

is an asymptotically normal estimator for

θ_{0}

, i.e.,

\sqrt{n} ({\hat{θ}}_{n} - θ_{0}) \overset{d}{⟶} N (0, V), n \to + \infty,

(32)

where

V = D^{⊤} Ω D,

D = {\partial m (θ) / \partial θ^{⊤}|}_{θ = θ_{0}},

m (θ) = {(f (r (θ)), g (θ))}^{⊤},

f (r) = (2 r + 1) / (1 + r),

r (θ) = tr Γ (1; θ) / tr Γ (0; θ),

g (θ) = Γ (0; θ) / (2 - f (r (θ))),

Γ (h; θ) = Cov_{θ} (X_{t}, X_{t - h})

.

Proof.

It is obvious that

θ \mapsto m (θ)

is an injective continuous mapping, well-defined in some neighborhood of the true parameters

θ_{0}

. Thus, since both almost-sure convergence and convergence in distribution are preserved under continuous mappings (see, e.g., Serfling [25] (p. 24)), the statement of the corollary immediately follows from Theorem 6. □

Remark 6.

According to Corollary 1 and Equations (26)–(28), we have

{\hat{p}}_{c} = f (\hat{r})

. In addition, the vectors

(A_{n}, B_{n})^{⊤} = {(tr {\hat{Γ}}_{X} (0), tr {\hat{Γ}}_{X} (1))}^{⊤}

are linear functionals of the moment-vector

V_{n}

, defined in Theorem 6, which implies

\sqrt{n} (\begin{matrix} A_{n} - A_{0} \\ B_{n} - B_{0} \end{matrix}) \overset{d}{\to} N (0, Σ_{A B}),

where

A_{0} = tr Γ_{X} (0)

,

B_{0} = tr Γ_{X} (1)

, and

Σ_{A B} = \lim_{n \to \infty} Cov (\sqrt{n} (\begin{matrix} A_{n} - A_{0} \\ B_{n} - B_{0} \end{matrix})) = \lim_{n \to \infty} (\begin{matrix} Var (\sqrt{n} A_{n}) & Cov (\sqrt{n} A_{n}, \sqrt{n} B_{n}) \\ Cov (\sqrt{n} A_{n}, \sqrt{n} B_{n}) & Var (\sqrt{n} B_{n}) \end{matrix})

denotes the asymptotic covariance matrix of the vector

(A_{n}, B_{n})^{⊤}

, which is obtained as a linear transformation of the limiting covariance matrix

Ω,

given by Equation (30). Thus, after some computation, for the asymptotic variance of

{\hat{p}}_{c}

one obtains

V_{1} = f^{'} {(r_{0})}^{2} \cdot Var (\sqrt{n} \hat{r}) = \frac{B_{0}^{2} Var (\sqrt{n} A_{n}) - 2 A_{0} B_{0} Cov (\sqrt{n} A_{n}, \sqrt{n} B_{n}) + A_{0}^{2} Var (\sqrt{n} B_{n})}{A_{0}^{4} (1+ r_{0})^{4}},

where

r_{0}

denotes the true value of the ratio

r (θ_{0}) .

This shows that the asymptotic efficiency of the estimator

{\hat{p}}_{c}

is fully determined by the second-order dependence structure of the increment process

(X_{t})

, summarized through the scalar statistic

r

. Moreover, it allows for a simplified variance estimation in practical applications, without requiring the full covariance matrix

Ω

. Finally, it should be noted that similar considerations are also possible for the estimator

\hat{Σ}

, defined by Equation (28).
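The variance expression for the estimator of p_c is an instance of the delta method and can be checked against an explicit gradient computation. In the sketch below, A_0 and B_0 are computed from assumed true values of p_c and Σ, while the matrix `S_AB` contains placeholder numbers chosen only to exercise the formula (it is not a computed limiting covariance). The function names are hypothetical.

```python
import numpy as np

def avar_p_formula(A0, B0, S_AB):
    """V_1 from the closed-form expression; S_AB is the limiting covariance of sqrt(n) (A_n, B_n)."""
    r0 = B0 / A0
    num = B0**2 * S_AB[0, 0] - 2.0 * A0 * B0 * S_AB[0, 1] + A0**2 * S_AB[1, 1]
    return num / (A0**4 * (1.0 + r0) ** 4)

def avar_p_gradient(A0, B0, S_AB):
    """Same quantity via the gradient of p = f(B / A) = (A + 2 B) / (A + B)."""
    grad = np.array([-B0, A0]) / (A0 + B0) ** 2   # (dp/dA, dp/dB)
    return grad @ S_AB @ grad

p_c, tr_Sigma = 0.3, 3.0                  # assumed true values (tr Sigma = 1 + 2)
A0 = (2.0 - p_c) * tr_Sigma               # tr Gamma_X(0)
B0 = (p_c - 1.0) * tr_Sigma               # tr Gamma_X(1)
S_AB = np.array([[60.0, -20.0], [-20.0, 40.0]])   # placeholder covariance matrix

print(avar_p_formula(A0, B0, S_AB), avar_p_gradient(A0, B0, S_AB))
```

Both functions return the same value because A_0⁴ (1 + r_0)⁴ = (A_0 + B_0)⁴; in practice `S_AB` would have to be replaced by an estimate of the limiting covariance.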

4.2. ECF Estimators

In this section, the unknown parameters of the 2D-GSB process are estimated using the ECF method, which was first rigorously developed in the pioneering works of Knight and Yu [26] and Yu [27]. Subsequent extensions of CF-based estimators have been discussed by Kotchoni [28], Meintanis et al. [29,30] and other authors. Following these contributions, an ECF procedure similar to those in Stojanović et al. [13] and Ljajko et al. [15] is employed here. The key idea underlying the ECF method is the bijective correspondence between CFs and their corresponding CDFs, implying that ECF retains all distributional information contained in the sample. Moreover, theoretical CFs are uniformly bounded, which contributes to the numerical stability of the resulting estimators. To this end, a general expression for the

ℓ

-th order CFs (

ℓ \geq 1

) of the increment series (

X_{t}

) of the 2D-GSB process is first presented:

Theorem 7.

Let the bivariate series of increments

(X_{t})

be defined by Equations (11) and (12), and let

θ = (p_{c}, Σ)

be the vector of unknown parameters. Then, for any block length

ℓ \geq 1

and any vector

u = (u_{1}, \dots, u_{ℓ}) \in (R^{2})^{ℓ}

the

ℓ

-dimensional characteristic function

φ_{X}^{(ℓ)} (u; θ) = E \exp (i \sum_{k = 1}^{ℓ} u_{k}^{⊤} X_{t + k - 1})

(33)

admits the explicit representation

φ_{X}^{(ℓ)} (u; θ) = \exp (- \frac{1}{2} {‖u_{ℓ}‖}_{Σ}^{2}) \prod_{k = 0}^{ℓ - 1} [p_{c} \exp (- \frac{1}{2} {‖u_{k}‖}_{Σ}^{2}) + (1 - p_{c}) \exp (- \frac{1}{2} {‖u_{k} - u_{k + 1}‖}_{Σ}^{2})]

(34)

where

u_{0} = 0_{2 \times 1}

,

p_{c} = P \{{‖ε_{t}‖}^{2} > c\}

, and

{‖u‖}_{Σ}^{2} : = u^{⊤} Σ u

, when

u \in R^{2}

.

Proof.

First, let us note that, according to Equation (18), the statement of the theorem is valid for the case when

ℓ = 1

and

u_{1} = u

. Now, assume that

ℓ > 1

and denote:

L (u; θ) ≔ \exp (i \sum_{k = 1}^{ℓ} u_{k}^{⊤} X_{t + k - 1}) = \exp [i \sum_{k = 1}^{ℓ} u_{k}^{⊤} (ε_{t + k - 1} - r_{t + k - 2} ε_{t + k - 2})] = \exp [i (u_{ℓ}^{⊤} ε_{t + ℓ - 1} + \sum_{k = 0}^{ℓ - 1} (u_{k}^{⊤} - r_{t + k - 1} u_{k + 1}^{⊤}) ε_{t + k - 1})] .

According to this and Equation (33), the

ℓ

-dimensional CF of the series

(X_{t})

is obtained as

φ_{X}^{(ℓ)} (u; θ) : = E [L (u; θ)]

, from which, after some elementary calculations, Equation (34) is easily shown. □

Next, let us denote by

\{X_{1}, \dots, X_{n}\}

some realization of length

n \in N

of the increments

(X_{t})

, as well as their corresponding

ℓ

-dimensional ECF

{\hat{φ}}_{n}^{(ℓ)} (u) ≔ \frac{1}{n - ℓ + 1} \sum_{t = 1}^{n - ℓ + 1} \exp (i u^{⊤} X_{t}^{(ℓ)}),

(35)

where

X_{t}^{(ℓ)} ≔ (X_{t}, \dots, X_{t + ℓ - 1})

is the overlapping block of length

ℓ \geq 1 .

The basic principle of the ECF method is to minimize the “distance” between the theoretical CF and its corresponding ECF, where ECF estimators are obtained by minimizing the objective function

Q_{n}^{(ℓ)} (θ) ≔ \int_{R^{2 ℓ}} g (u) {|{\hat{φ}}_{n}^{(ℓ)} (u) - φ_{X}^{(ℓ)} (u; θ)|}^{2} d u

(36)

with respect to the parameters

θ = (p_{c}, Σ)

. Here,

d u ≔ d u_{1} \dots d u_{ℓ}

, and

g : R^{2 ℓ} \to R^{+}

is some weight function. Therefore, the ECF estimates are solutions to the following minimization problem:

{\hat{θ}}_{n}^{(ℓ)} = \arg \min_{θ \in Θ} Q_{n}^{(ℓ)} (θ),

(37)

where

Θ = (0,1) \times {(0, + \infty)}^{3}

is a non-trivial parameter space. Using some general results of the ECF asymptotic theory, the strong consistency and asymptotic normality (AN) of the ECF estimators, under certain regularity conditions, can be proven as follows:

Theorem 8.

Let

θ_{0}

be the true value of the parameter

θ

, and for arbitrary

n = 1, 2, \dots,

let

{\hat{θ}}_{n}^{(ℓ)}

be the solutions of Equation (37). In addition, assume that the following regularity conditions are satisfied:

(R₁) The weight function $g (u)$ is real-valued, nonnegative and integrable, with

$\int_{R^{2 ℓ}} {‖u‖}^{2} g (u) d u < + \infty .$
(R₂) The parameter $θ$ is identifiable from the ℓ-dimensional CF, given by Equations (33) and (34), i.e., from the equality $φ_{X}^{(ℓ)} (u; θ_{1}) = φ_{X}^{(ℓ)} (u; θ_{2}),$ for almost all $u \in R^{2 ℓ},$ it follows $θ_{1} = θ_{2},$ for any $θ_{1}, θ_{2} \in Θ .$
(R₃) There exists a compact set $Θ^{'} = [0,1] \times {[0, M]}^{3} \subset \bar{Θ}$ , where M is sufficiently large so that $θ_{0}, {\hat{θ}}_{n}^{(ℓ)} \in int Θ^{'}$ , for any $n \geq n_{0} > 0 .$
(R₄) The function $φ_{X}^{(ℓ)} (u; θ)$ is twice continuously differentiable with respect to $θ$ , uniformly in $u \in R^{2 ℓ} .$
(R₅) ${\nabla_{θ}^{2} Q_{n}^{(ℓ)} (θ)|}_{θ = θ_{0}}$ is a positive definite (hence regular) matrix.
Then, for any $ℓ \geq 2,$ ${\hat{θ}}_{n}^{(ℓ)}$ is a strongly consistent and asymptotically normal estimator for θ.

Proof.

First, relabel the ECF and the

ℓ

-blocks, defined by Equation (35), as follows:

{\hat{φ}}_{n}^{(ℓ)} (u) = \frac{1}{n - ℓ + 1} \sum_{t = 1}^{n - ℓ + 1} W_{t} (u), W_{t} (u) : = \exp (i \sum_{j = 1}^{ℓ} u_{j}^{⊤} X_{t + j - 1}) .

By construction of the increment process (

X_{t})

, for each fixed

u \in R^{2 ℓ}

the series

(W_{t} (u))_{t \geq 1}

is strictly stationary and

ℓ + 1

-dependent. Furthermore, the equality

|W_{t} (u)| = 1

holds for any

t \geq 1

and

u \in R^{2 ℓ}

, so that

W_{t} (u) \in L^{1}

. Hence, by applying the SLLN for

m

-dependent stationary series (see, e.g., [31]), it follows that

{\hat{φ}}_{n}^{(ℓ)} (u) \overset{a . s .}{\to} E [W_{1} (u)] = φ_{X}^{(ℓ)} (u; θ_{0}),

for each fixed

u \in R^{2 ℓ} .

According to the continuity of the CFs defined in (33) and (34), the pointwise convergence above implies uniform convergence on any compact

C \subset R^{2 ℓ},

i.e.,

\underset{u \in C}{s u p} |{\hat{φ}}_{n}^{(ℓ)} (u) - φ_{X}^{(ℓ)} (u; θ_{0})| \overset{a . s .}{\to} 0, n \to + \infty .

(38)

Further, if we define the function

Q_{0}^{(ℓ)} (θ) : = \int_{R^{2 ℓ}} g (u) {|φ_{X}^{(ℓ)} (u; θ_{0}) - φ_{X}^{(ℓ)} (u; θ)|}^{2} d u \geq 0,

then, according to condition (R₁), it is well defined and continuous on the compact

Θ^{'}

. Additionally, from the equality

Q_{0}^{(ℓ)} (θ) = 0

and condition (R₂) it follows that

θ = θ_{0}

; that is,

Q_{0}^{(ℓ)} (θ)

has a unique minimum at the point

θ_{0}

.

On the other hand, since both the empirical and theoretical CFs are uniformly bounded by one, using a similar procedure as in Knight and Yu [26], one obtains:

|Q_{n}^{(ℓ)} (θ) - Q_{0}^{(ℓ)} (θ)| = |\int_{R^{2 ℓ}} g (u) ({|{\hat{φ}}_{n}^{(ℓ)} (u) - φ_{X}^{(ℓ)} (u; θ)|}^{2} - {|φ_{X}^{(ℓ)} (u; θ_{0}) - φ_{X}^{(ℓ)} (u; θ)|}^{2}) d u| \leq \int_{R^{2 ℓ}} g (u) |{\hat{φ}}_{n}^{(ℓ)} (u) - φ_{X}^{(ℓ)} (u; θ_{0})| (|{\hat{φ}}_{n}^{(ℓ)} (u) - φ_{X}^{(ℓ)} (u; θ)| + |φ_{X}^{(ℓ)} (u; θ_{0}) - φ_{X}^{(ℓ)} (u; θ)|) d u \leq 4 \int_{R^{2 ℓ}} g (u) |{\hat{φ}}_{n}^{(ℓ)} (u) - φ_{X}^{(ℓ)} (u; θ_{0})| d u .

Hence, convergence (38) and condition (R₃) yield the uniform convergence:

\underset{θ \in Θ^{'}}{s u p} |Q_{n}^{(ℓ)} (θ) - Q_{0}^{(ℓ)} (θ)| \overset{a . s .}{\to} 0, n \to + \infty .

Finally, by using Theorem 2.1 in Newey and McFadden [32], we get

{\hat{θ}}_{n}^{(ℓ)} \overset{a . s .}{\to} θ_{0}, n \to + \infty,

that is, the ECF estimator

{\hat{θ}}_{n}^{(ℓ)}

is strongly consistent.

Let us show the AN property of

{\hat{θ}}_{n}^{(ℓ)} .

According to condition (R₄), the function

Q_{n}^{(ℓ)} (θ)

is twice differentiable near the point

θ = {\hat{θ}}_{n}^{(ℓ)}

. Then, using the Taylor expansion of this function, one obtains

\nabla_{θ} Q_{n}^{(ℓ)} ({\hat{θ}}_{n}^{(ℓ)}) = \nabla_{θ} Q_{n}^{(ℓ)} (θ_{0}) + \nabla_{θ}^{2} Q_{n}^{(ℓ)} ({\bar{θ}}_{n}) ({\hat{θ}}_{n}^{(ℓ)} - θ_{0}),

(39)

where

{\bar{θ}}_{n}

is between

{\hat{θ}}_{n}^{(ℓ)}

and

θ_{0} .

By definition of the ECF estimator

{\hat{θ}}_{n}^{(ℓ)},

given by Equation (37), the equality

\nabla_{θ} Q_{n}^{(ℓ)} ({\hat{θ}}_{n}^{(ℓ)}) = 0

holds, and from Equation (39) it follows:

\sqrt{n} ({\hat{θ}}_{n}^{(ℓ)} - θ_{0}) = - (\nabla_{θ}^{2} Q_{n}^{(ℓ)} ({\bar{θ}}_{n}))^{- 1} \sqrt{n} \nabla_{θ} Q_{n}^{(ℓ)} (θ_{0}) .

(40)

According to the mentioned properties of the function

Q_{n}^{(ℓ)} (θ)

, it can be differentiated under the integral sign, i.e.,

\nabla_{θ} Q_{n}^{(ℓ)} (θ) = - 2 \int_{R^{2 ℓ}} g (u) ({\hat{φ}}_{n}^{(ℓ)} (u) - φ_{X}^{(ℓ)} (u; θ)) \nabla_{θ} φ_{X}^{(ℓ)} (u; θ) d u

(41)

and:

\nabla_{θ}^{2} Q_{n}^{(ℓ)} (θ) = 2 \int_{R^{2 ℓ}} g (u) \nabla_{θ} φ_{X}^{(ℓ)} (u; θ) \nabla_{θ} φ_{X}^{(ℓ)} (u; θ)^{⊤} d u + 2 \int_{R^{2 ℓ}} g (u) (φ_{X}^{(ℓ)} (u; θ) - {\hat{φ}}_{n}^{(ℓ)} (u)) \nabla_{θ}^{2} φ_{X}^{(ℓ)} (u; θ) d u .

(42)

Thus, by substituting

θ = θ_{0}

and Equation (35) into Equation (41), one obtains

\sqrt{n} \nabla_{θ} Q_{n}^{(ℓ)} (θ_{0}) = \frac{2}{\sqrt{n}} \sum_{t = 1}^{n} ψ_{t} + o_{p} (1),

where:

ψ_{t} = \int_{R^{2 ℓ}} g (u) (φ_{X}^{(ℓ)} (u; θ_{0}) - e^{i u^{⊤} X_{t}^{(ℓ)}}) \nabla_{θ} φ_{X}^{(ℓ)} (u; θ_{0}) d u .

In doing so, the series

(ψ_{t})

depends on

ε_{t - 1}, \dots, ε_{t + ℓ - 1},

so it is strictly stationary,

(ℓ+ 1)

-dependent, with

E {‖ψ_{t}‖}^{2} < \infty .

Hence, the central limit theorem for a strictly stationary,

(ℓ+ 1)

-dependent series with a finite second moment gives

\frac{1}{\sqrt{n}} \sum_{t = 1}^{n} ψ_{t} \overset{d}{\to} N (0, D),

(43)

where

D = \sum_{k = - ℓ}^{ℓ} Cov (ψ_{0}, ψ_{k}) .

In addition, the following convergence holds:

\nabla_{θ}^{2} Q_{n}^{(ℓ)} ({\bar{θ}}_{n}) \overset{a . s .}{\to} J, n \to + \infty,

(44)

where, according to condition (R₅) and Equation (42),

J = E ({\nabla_{θ}^{2} Q_{n}^{(ℓ)} (θ)|}_{θ = θ_{0}}) = 2 \int_{R^{2 ℓ}} g (u) \nabla_{θ} φ_{X}^{(ℓ)} (u; θ_{0}) \nabla_{θ} φ_{X}^{(ℓ)} (u; θ_{0})^{⊤} d u < + \infty .

Thus, Equations (40), (43) and (44) yield

\sqrt{n} ({\hat{θ}}_{n}^{(ℓ)} - θ_{0}) \overset{d}{\to} N (0, J^{- 1} D J^{- 1}),

that is, the ECF estimator

{\hat{θ}}_{n}^{(ℓ)}

is AN for the true parameter

θ_{0} .

□

Remark 7.

Note that the asymptotic properties of the ECF estimator for the 2D-GSB model follow the same structural principles as in the one-dimensional case (see, e.g., Stojanović et al. [13]). Thus, the multivariate nature of the increments affects only the analytic form of the CF, while stationarity, short-range dependence and identifiability ensure the validity of the standard ECF asymptotic theory. Moreover, using similar considerations as Yu [27], it can be shown that the above procedure holds if the theoretical CF is of order

ℓ \geq 1

at least equal to the number of its parameters. For that purpose,

ℓ = 2

is the minimal value ensuring the identifiability of all model parameters. By substituting the value

ℓ = 2

into Equation (34), the explicit form of the CF is as follows:

φ_{X}^{(2)} (u_{1}, u_{2}) = \exp (- \frac{1}{2} u_{2}^{⊤} Σ u_{2}) [p_{c}^{2} \exp (- \frac{1}{2} u_{1}^{⊤} Σ u_{1}) + p_{c} (1 - p_{c}) \exp (- u_{1}^{⊤} Σ u_{1}) + p_{c} (1 - p_{c}) \exp (- \frac{1}{2} (u_{1} - u_{2})^{⊤} Σ (u_{1} - u_{2})) + {(1 - p_{c})}^{2} \exp (- \frac{1}{2} (u_{1} - u_{2})^{⊤} Σ (u_{1} - u_{2}) - \frac{1}{2} u_{1}^{⊤} Σ u_{1})] .

Thus, we base the estimation procedure on the two-dimensional ECF of the vector series

X_{t}^{(2)} = (X_{t}^{⊤}, X_{t + 1}^{⊤})^{⊤} \in R^{4}, t = 1, \dots, n - 1,

whose explicit expression for

u = (u_{1}^{⊤}, u_{2}^{⊤})^{⊤} \in R^{4}

is as follows:

{\hat{φ}}_{n}^{(2)} (u_{1}, u_{2}) = \frac{1}{n - 1} \sum_{t = 1}^{n - 1} \exp (i u_{1}^{⊤} X_{t}+ i u_{2}^{⊤} X_{t + 1}), Re {\hat{φ}}_{n}^{(2)} (u_{1}, u_{2}) = \frac{1}{n - 1} \sum_{t = 1}^{n - 1} \cos (u_{1}^{⊤} X_{t}+ u_{2}^{⊤} X_{t + 1}) .

In addition, the choice of the weighting function

g (u)

plays a crucial role in the performance of the ECF estimator. In what follows, we adopt a smooth exponentially decaying weight

g (u) = \exp (- {‖u‖}^{2}),

which ensures numerical stability and satisfies the regularity conditions in Theorem 8.
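The ingredients of the ECF procedure for ℓ = 2 can be sketched as follows. This is a hedged illustration, not the authors' code. It uses the explicit four-term expression of the two-dimensional CF given in this remark, the ECF over overlapping blocks, and a Monte Carlo approximation of the objective function in which the integration nodes are drawn from N(0, I/2) on R⁴, whose density is proportional to g(u) = exp(-‖u‖²). The increments are simulated with i.i.d. Bernoulli(p_c) indicators independent of the innovations, which is an assumption of this sketch, and the parameter ordering (p_c, Σ_11, Σ_22, Σ_12), the true values and all function names are hypothetical. A coarse grid search over p_c stands in for a numerical optimizer.

```python
import numpy as np

def simulate_increments(n, Sigma, p_c, rng):
    """X_t = eps_t - (1 - q_{t-1}) eps_{t-1}; q_t i.i.d. Bernoulli(p_c) (assumption of this sketch)."""
    eps = rng.multivariate_normal(np.zeros(2), Sigma, size=n + 1)
    q = rng.random(n + 1) < p_c
    return eps[1:] - (1.0 - q[:-1])[:, None] * eps[:-1]

def quad(u, Sigma):
    """Row-wise quadratic forms u' Sigma u for an (m, 2) array u."""
    return np.einsum("ij,jk,ik->i", u, Sigma, u)

def cf2(u1, u2, p_c, Sigma):
    """Theoretical CF for l = 2 (explicit four-term expression)."""
    a, b, d = quad(u1, Sigma), quad(u2, Sigma), quad(u1 - u2, Sigma)
    return np.exp(-0.5 * b) * (
        p_c**2 * np.exp(-0.5 * a)
        + p_c * (1.0 - p_c) * np.exp(-a)
        + p_c * (1.0 - p_c) * np.exp(-0.5 * d)
        + (1.0 - p_c) ** 2 * np.exp(-0.5 * d - 0.5 * a)
    )

def ecf2(u1, u2, X):
    """Empirical CF over overlapping blocks (X_t, X_{t+1}); complex array of shape (m,)."""
    phase = u1 @ X[:-1].T + u2 @ X[1:].T       # shape (m, n - 1)
    return np.exp(1j * phase).mean(axis=1)

def objective(theta, ecf_vals, u1, u2):
    """Monte Carlo approximation of the objective, up to the constant factor pi^2."""
    p_c, s11, s22, s12 = theta
    Sigma = np.array([[s11, s12], [s12, s22]])
    return np.mean(np.abs(ecf_vals - cf2(u1, u2, p_c, Sigma)) ** 2)

rng = np.random.default_rng(7)
Sigma0, p0 = np.array([[1.0, 0.5], [0.5, 2.0]]), 0.3     # assumed true values
X = simulate_increments(40_000, Sigma0, p0, rng)
nodes = rng.normal(scale=np.sqrt(0.5), size=(64, 4))     # u ~ N(0, I/2)
u1, u2 = nodes[:, :2], nodes[:, 2:]
ecf_vals = ecf2(u1, u2, X)                               # computed once, reused below

q_true = objective((0.3, 1.0, 2.0, 0.5), ecf_vals, u1, u2)
q_wrong = objective((0.8, 1.0, 2.0, 0.5), ecf_vals, u1, u2)
grid = np.linspace(0.05, 0.95, 19)
values = [objective((p, 1.0, 2.0, 0.5), ecf_vals, u1, u2) for p in grid]
p_grid = grid[np.argmin(values)]
print(q_true, q_wrong, p_grid)
```

Sampling the nodes from a density proportional to the weight function replaces the four-dimensional integral by an average, which changes the objective only by a constant factor and therefore leaves the minimizer unchanged. In a full implementation, all four parameters would be optimized jointly.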

4.3. Estimators of the Mean

Let the observable bivariate series

(Y_{t})

of the 2D-GSB process be given by Equation (1). The unconditional mean vector is

μ ≔ E (Y_{t})

, and its natural estimator is the sample mean vector:

{\tilde{μ}}_{n} ≔ {\bar{Y}}_{n} = \frac{1}{n} \sum_{t = 1}^{n} Y_{t} .

(45)

Since

E (Y_{t}) = μ_{0} = {(m_{1}, m_{2})}^{⊤}

, the estimator

{\tilde{μ}}_{n}

is unbiased, i.e.,

E ({\tilde{μ}}_{n}) = μ_{0} .

Using the representation of

(Y_{t})

in terms of the innovations

(ε_{t})

, used in the proof of Theorem 5, the estimator

{\tilde{μ}}_{n}

can be written as a sum of uncorrelated random vectors:

{\tilde{μ}}_{n} = μ_{0} + \frac{1}{n} [\sum_{k = 1}^{n} (1 + k q_{n - k}) ε_{n - k} + ε_{n}] .

This yields the covariance matrix

Var ({\tilde{μ}}_{n}) = \frac{1}{n^{2}} [\sum_{k = 1}^{n} E {(1 + k q_{n - k})}^{2} + 1] Σ = \frac{1}{n^{2}} [\sum_{k = 1}^{n} (1 + p_{c} k (k + 2)) + 1] Σ = \frac{1}{n^{2}} [n + 1 + p_{c} \frac{n (n + 1) (2 n + 7)}{6}] Σ = \frac{n + 1}{n^{2}} (1 + p_{c} \frac{n (2 n + 7)}{6}) Σ = \frac{p_{c} n}{3} Σ + O (1), n \to + \infty,

and implies that the variance of

{\tilde{μ}}_{n}

is unbounded.

Motivated by the time-dependent covariance structure of the 2D-GSB process, and similar to Jovanović et al. [12], we also introduce the weighted estimator

{\hat{μ}}_{n} ≔ \frac{1}{n} \sum_{t = 1}^{n} {\bar{Y}}_{t} = \frac{1}{n} \sum_{t = 1}^{n} ω_{t} Y_{t}, ω_{t} ≔ H (n) - H (t - 1),

(46)

where

H (t) = \sum_{j = 1}^{t} j^{- 1}

denotes the harmonic numbers, with

H (0) = 0

. Clearly,

E ({\hat{μ}}_{n}) = μ_{0},

so the estimator

{\hat{μ}}_{n}

is also unbiased. Using an analogous decomposition into sums of uncorrelated random vectors (see, e.g., Jovanović et al. [12]), one obtains:

Var ({\hat{μ}}_{n}) = p_{c} H^{2} (n) Σ + O (H^{- 2} (n)), n \to + \infty .

Since

H (n) \sim \ln n,

as

n \to + \infty

, it follows that:

\frac{Var ({\hat{μ}}_{n})}{Var ({\tilde{μ}}_{n})} = O (\frac{\ln^{2} n}{n}) ⟶ 0, n \to + \infty .

Therefore, the weighted estimator

{\hat{μ}}_{n}

is asymptotically more efficient than the simple sample mean estimator

{\tilde{μ}}_{n}

. This can also be seen in Figure 5, which shows 3D plots of both asymptotic variances, viewed as functions of the variables

p_{c} \in (0,1)

and

n > 0

. Note that the covariance matrix

Σ

is factored out here, i.e., the surfaces represent the scalar part of the asymptotic variances.
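The two equivalent forms of the weighted estimator in Equation (46), the average of running means and the harmonic-weighted sum, can be illustrated as follows. This is a sketch under stated assumptions: the function names are introduced here, and the demo series is a plain Gaussian random walk used as a stand-in, not a simulated 2D-GSB path.

```python
import numpy as np

def harmonic_weights(n):
    """Weights w_t = H(n) - H(t - 1), t = 1..n, with H(0) = 0."""
    inv = 1.0 / np.arange(1, n + 1)
    h = np.concatenate(([0.0], np.cumsum(inv)))  # h[t] = H(t) for t = 0..n
    return h[n] - h[:n]                          # position t - 1 holds H(n) - H(t - 1)

def weighted_mean(y):
    """(1/n) * sum_t w_t * Y_t for a series y of shape (n, 2)."""
    y = np.asarray(y, dtype=float)
    n = y.shape[0]
    return harmonic_weights(n) @ y / n

def mean_of_running_means(y):
    """(1/n) * sum_t Ybar_t, where Ybar_t is the mean of Y_1..Y_t."""
    y = np.asarray(y, dtype=float)
    n = y.shape[0]
    running = np.cumsum(y, axis=0) / np.arange(1, n + 1)[:, None]
    return running.mean(axis=0)

rng = np.random.default_rng(1)
y_demo = np.cumsum(rng.normal(size=(200, 2)), axis=0)  # stand-in non-stationary series

print(harmonic_weights(200).sum())          # the weights add up to n
print(weighted_mean(y_demo), mean_of_running_means(y_demo))
```

Because the weights sum to n, the estimator stays unbiased while early observations, whose variance is smaller, receive more weight than late ones.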

5. Numerical Simulation and Application

Two aspects of the practical implementation of the 2D-GSB process are examined here. First, Monte Carlo simulations of the basic series of the 2D-GSB model are carried out, the previously described estimators are computed from them, and their efficiency is analyzed. Then, using real data, the 2D-GSB process is applied to the analysis of the dynamics and empirical distributions of the total number of different forms of criminal offenses in the Republic of Serbia.

5.1. Numerical Simulations of the 2D-GSB Estimates

This section describes the estimation of the parameters of the 2D-GSB model, based on

S = 300

independent Monte Carlo replications of the basic 2D-GSB series. In the first step, a series of innovations

ε_{t} = {(ε_{t}^{(1)}, ε_{t}^{(2)})}^{⊤}

is generated as independent and identically distributed vectors from the bivariate normal distribution

N (0, Σ)

, and thereafter, the indicator series

q_{t} (c)

, defined as in Equation (4), is determined. From these, the basic 2D-GSB series

(Y_{t})

is constructed, with the mean vector

μ_{0} = {(μ_{0}^{(1)}, μ_{0}^{(2)})}^{⊤},

as well as the bivariate increments

(X_{t})

, which are used for estimation of the unknown parameters

θ = {(p_{c}, σ_{1}^{2}, σ_{2}^{2}, ρ)}^{⊤} .

The numerical simulations are designed to examine the finite-sample behavior of the proposed estimators and to verify the theoretical results derived in Section 4. In particular, the Monte Carlo study focuses on the accuracy, stability, and asymptotic properties of the estimators under repeated sampling from the 2D-GSB process. In doing so, for the basic series

(Y_{t})

the mean vector

μ_{0} = {(μ_{0}^{(1)}, μ_{0}^{(2)})}^{⊤} = {(0,1)}^{⊤}

is taken, and for the threshold parameter the value

c = 4

is chosen, together with the covariance matrix:

Σ = [\begin{matrix} σ_{1}^{2} & ρ σ_{1} σ_{2} \\ ρ σ_{1} σ_{2} & σ_{2}^{2} \end{matrix}] = [\begin{matrix} 1 & 1 \\ 1 & 4 \end{matrix}] .

It is worth noting that then, according to the previous considerations (see Remark 3), the parameter

p_{c}

, defined by Equation (4), represents the survivor function of the mixture

χ_{2}^{2}

distribution, which does not have a closed form. Thus, it is estimated here by an additional Monte Carlo experiment with

N = 200,000

independent realizations. Note that using such extensive simulations, the estimated value of

p_{c}

is obtained with high accuracy, so it can be used as a reference value. In this way, as the true values of the parameters, the vector

θ_{0} = {(p_{c}, σ_{1}^{2}, σ_{2}^{2}, ρ)}^{⊤} = {(0.3967, 1, 4, 0.5)}^{⊤}

is obtained.
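The reference value of p_c can be approximated with a few lines of simulation. The sketch below is an illustration only: Equation (4) is not reproduced in this section, so the code assumes that the indicator is active when the squared Euclidean norm of the innovation exceeds c. Under that assumption the estimate should land close to the reported 0.3967. The second estimate uses the representation of the squared norm as an eigenvalue-weighted sum of independent squared standard normals; all variable names are introduced here.

```python
import numpy as np

sigma = np.array([[1.0, 1.0],
                  [1.0, 4.0]])   # covariance matrix used in the simulation study
c = 4.0                          # threshold parameter
n_draws = 200_000                # same number of realizations as in the text

rng = np.random.default_rng(2026)

# Direct estimate: share of innovations whose squared norm exceeds c (assumed rule).
eps = rng.multivariate_normal(np.zeros(2), sigma, size=n_draws)
p_c_direct = float(np.mean(np.sum(eps**2, axis=1) > c))

# Equivalent estimate: ||eps||^2 has the law of lam_1 * Z_1^2 + lam_2 * Z_2^2,
# where lam_i are the eigenvalues of sigma and Z_i are independent N(0, 1).
lam = np.linalg.eigvalsh(sigma)
z = rng.standard_normal((n_draws, 2))
p_c_weighted = float(np.mean(z**2 @ lam > c))

print(lam, p_c_direct, p_c_weighted)
```

With 200,000 draws the Monte Carlo standard error is about 0.001, so agreement to two decimals is what one should expect.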

The estimates of the vector

μ_{0}

are calculated using estimators

{\tilde{μ}}_{n}

and

{\hat{μ}}_{n}

, defined by Equations (45) and (46), respectively. To this end, realizations of the series

(Y_{t})

of length

n = 150

are observed, and descriptive statistics (Min, Mean, Max), along with the appropriate estimation errors, i.e., bias, standard deviation (StDev) and root mean-squared error (RMSE), of the estimates thus obtained are shown in Table 1. As mentioned above, due to the non-stationarity of the series

(Y_{t})

, estimates of the mean vector

μ_{0}

have an unbounded asymptotic variance, so their observed values span a wide range. Nevertheless, the estimator

{\hat{μ}}_{n}

is more efficient than

{\tilde{μ}}_{n}

, because its error statistics are substantially smaller. Furthermore, to investigate the asymptotic properties of the estimates thus obtained, they were also tested for the AN property, and the results of these tests are also presented in Table 1. For this purpose, the following three statistical tests of normality were used:

- Shapiro–Wilk normality test (SW);
- Anderson–Darling normality test (AD);
- Jarque–Bera normality test (JB).

Test statistics, as well as their corresponding

p

-values (listed in parentheses above), were calculated using procedures from the R-4.5.2 package “nortest” [33]. It is evident that both estimators

{\tilde{μ}}_{n}

and

{\hat{μ}}_{n}

have the AN property, even though they are obtained from the realization of a non-stationary series

(Y_{t})

. It should be noted that this is closely related to Theorems 4 and 5, which, among other things, describe the AN properties of scaled processes based on the observed GSB series

(Y_{t})

.

Further, the MoM estimates of the true parameter

θ_{0}

are obtained directly from Equations (26)–(28), while the ECF estimates are calculated by minimizing the integral given by Equation (36). To this end, as in Milovanović [34], Gauss–Hermite cubature is used, with the weight function

g (u_{1}, u_{2}) = \exp (- {‖u_{1}‖}^{2} - {‖u_{2}‖}^{2}),

u_{1}, u_{2} \in R^{2},

and 81 cubature nodes; the entire procedure is carried out using the R-4.5.2 package “statmod” [35]. Then, taking the previously obtained MoM estimates as initial values, the objective function given by Equation (36) is minimized using the bound-constrained optimization procedure “L-BFGS-B” [36], also implemented in the statistical programming language “R”. Finally, to examine the efficiency and the other previously mentioned asymptotic properties of the estimates thus obtained, different series lengths

n \in \{150, 500, 1500\}

are considered. Their basic descriptive statistics, along with statistics and

p

-values of the aforementioned normality tests, are also calculated in the statistical software “R” and presented in Tables 2–4.
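To make the cubature step concrete, the sketch below builds a tensor-product Gauss–Hermite rule on R^4 with three nodes per coordinate, which gives 3^4 = 81 nodes for the weight exp(-‖u₁‖² - ‖u₂‖²). This is only one arrangement consistent with the node count stated above; the text does not specify the layout, and the authors' computation uses the R package “statmod”, not this code. The helper name `cubature` is an assumption introduced here.

```python
import itertools
import numpy as np

# One-dimensional Gauss-Hermite rule for the weight exp(-x^2), 3 nodes.
nodes_1d, weights_1d = np.polynomial.hermite.hermgauss(3)

# Tensor product over the four coordinates of u = (u1, u2) in R^4.
nodes = np.array(list(itertools.product(nodes_1d, repeat=4)))            # shape (81, 4)
weights = np.array([np.prod(w) for w in itertools.product(weights_1d, repeat=4)])

def cubature(f):
    """Approximate the integral over R^4 of f(u) * exp(-||u||^2)."""
    values = np.array([f(u) for u in nodes])
    return float(np.sum(weights * values))

# Sanity checks against integrals known in closed form.
print(nodes.shape)
print(cubature(lambda u: 1.0), np.pi**2)              # total mass of the weight
print(cubature(lambda u: u[0] ** 2), np.pi**2 / 2)    # second moment in one coordinate
```

In an ECF fit, `f` would be the squared distance between the empirical and model characteristic functions at the node `u`; the resulting scalar is then passed to a bound-constrained optimizer started from the MoM estimates.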

The results thus reported indicate that both estimation procedures perform satisfactorily even for moderate sample sizes. The empirical means of all estimated parameters are very close to their true values, while the corresponding biases remain small and mainly decrease as the sample size increases. A mild non-monotonic behavior of the finite-sample accuracy can be observed for the variance parameter

σ_{2}^{2}

, which is typical for mixture-type models. Additionally, within the 2D-GSB framework, variance parameters enter both the dispersion structure and the regime-selection probability

p_{c}

, which depends implicitly on the matrix

Σ

. Overall, as expected, the dispersion of the estimates measured through standard deviations and mean squared errors generally decreases as the sample size increases. For larger samples, including the longest series considered (

n = 1500

), the estimates exhibit small bias and moderate variability across all parameters, supporting their suitability for empirical applications based on longer time series.

The two approaches play complementary roles. The MoM procedure is computationally straightforward and, owing to its closed-form structure, particularly suitable for quick preliminary estimation or large datasets. The ECF approach involves a higher computational cost but has stronger asymptotic efficiency properties. The results reported in Tables 2–4 suggest that the ECF estimator tends to achieve slightly lower dispersion and mean squared error in moderate samples, whereas MoM remains stable and practically convenient. Also, note that in practical implementation, the threshold parameter

c

can be determined either via its theoretical one-to-one relationship with

p_{c}

or through a quantile-based calibration of the innovation norm. This ensures a transparent and data-driven specification of regime activation.
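A quantile-based calibration of the threshold can be sketched as follows. The code assumes, as an illustration, that regime activation is defined through the squared Euclidean norm of the innovations, so that c is taken as the empirical (1 - p_c)-quantile of those squared norms. The function `calibrate_threshold` and the simulated innovations are hypothetical; in an application the innovations would have to be replaced by suitable estimates.

```python
import numpy as np

def calibrate_threshold(innov, p_c):
    """Empirical (1 - p_c)-quantile of the squared norms of the rows of innov."""
    sq_norm = np.sum(np.asarray(innov, dtype=float) ** 2, axis=1)
    return float(np.quantile(sq_norm, 1.0 - p_c))

# Simulated innovations with the covariance matrix of the Monte Carlo study.
rng = np.random.default_rng(7)
sigma_demo = np.array([[1.0, 1.0],
                       [1.0, 4.0]])
eps_demo = rng.multivariate_normal(np.zeros(2), sigma_demo, size=100_000)

c_hat = calibrate_threshold(eps_demo, p_c=0.3967)
share_above = float(np.mean(np.sum(eps_demo**2, axis=1) > c_hat))
print(c_hat, share_above)
```

By construction the share of observations above the calibrated threshold matches the target probability, which is what makes the specification of regime activation transparent.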

Finally, note that although the model is derived under Gaussian innovation assumptions, the mixture-based structure provides a degree of robustness to moderate deviations from normality, as reflected in the empirical application. As an illustration, Figure 6 and Figure 7 display the Q–Q plots of the empirical distributions of the estimated parameters against the corresponding Gaussian quantiles for

n = 1500

. The plots provide graphical support for the asymptotic normality of both estimation procedures and indicate a slightly improved finite-sample behavior of the ECF estimators. These graphical representations are thus consistent with the theoretical asymptotic results and with the variance and RMSE comparisons shown in Tables 2–4. In general, the Monte Carlo results confirm the above-mentioned theoretical properties of the 2D-GSB estimators and support their applicability in practical multivariate time series analysis.

5.2. Application: A Case Study of Crime Dynamics

Having examined the finite-sample properties of the proposed estimators through Monte Carlo simulations, we now consider an empirical application of the 2D-GSB framework. To illustrate the practical performance of the proposed model, we apply it to real-world multivariate time series representing the total number of specific criminal offenses committed on the territory of the Republic of Serbia. The data were obtained from official records of the Ministry of Internal Affairs of the Republic of Serbia, which are kept daily, from 1 January 2015 to 31 December 2024, giving a time series length of

n = 3653

. Each of the observed series is obtained and classified according to the official Criminal Code of the Republic of Serbia (code KD_xxx), and each bivariate series has related criminal activities as its components. In this way, two bivariate series are observed, designated as Series A and Series B, whose components are the following:

A₁: Petty theft (code KD_203).

A₂: Aggravated theft and robbery (code KD_204).

B₁: Counterfeiting money, securities, counterfeiting and misuse of payment cards (codes KD_241-244).

B₂: Document falsification and other special cases of document falsification (codes KD_355-357).

The dynamics of both bivariate time series are illustrated in Figure 8, where the pronounced fluctuations, i.e., sudden “jumps” in the number of committed criminal acts, are clearly visible. At the same time, the intercorrelation between the components of both bivariate series is noticeable even at first glance. Therefore, use of a synchronized threshold mechanism is particularly suitable in this context, as external shocks (e.g., policy changes, economic disturbances, or enforcement actions) may simultaneously affect related crime categories. The common regime indicator thus provides a natural interpretation of coordinated spikes and structural shifts observed in the data. Note that although the observed series represent daily crime counts, the proposed 2D-GSB model is not intended to directly model the count-valued observation space, but rather to capture the underlying common dynamics of pronounced fluctuations and synchronized regime changes.

In this context, the descriptive statistics reported in Table 5 reveal a pronounced overdispersion of both series (especially Series A), as well as extremely heavy-tailed behavior, reflected in very high kurtosis and skewness in Series B. In addition, the average value of document forgeries (component B₂) is approximately 7.3 per day, but the range varies from as few as 0 to as many as 304 such crimes per day. Along with the significant cross-dependence between their components, these features motivate the use of a latent regime-based framework rather than standard count-based models. Also, in contrast to univariate approaches, the two-dimensional GSB framework enables the joint modeling of related crime categories while explicitly accounting for cross-dependence in their extreme dynamics.

Since the original series represent crime counts and exhibit pronounced heteroscedasticity and skewness, a logarithmic transformation (“log-volume”) is applied prior to modeling. This transformation not only stabilizes the variance and reduces asymmetry but also facilitates a closer approximation of the increment process by a Gaussian mixture distribution, as assumed in the 2D-GSB framework. Consequently, the transformed increment series is more consistent with the underlying distributional structure of the model. For these reasons, as basic bivariate series

(Y_{A, t})

and

(Y_{B, t})

, the realizations of the so-called log-volumes, i.e., logarithmic values of series A and B, are observed as follows:

Y_{A, t}^{(i)} ≔ \ln (A_{i, t}), Y_{B, t}^{(i)} ≔ \ln (1 + B_{i, t}), i = 1, 2 .

(47)

As stated in [37,38], the main goal of these transformations is to spread the values of both series more evenly, while, because the logarithmic function is increasing, the pronounced fluctuations are preserved. Additionally, note that, unlike the series

Y_{A, t} = {(Y_{A, t}^{(1)}, Y_{A, t}^{(2)})}^{⊤}

, which represents the usual log-transformation, the series

Y_{B, t} = {(Y_{B, t}^{(1)}, Y_{B, t}^{(2)})}^{⊤}

is a so-called shifted log-transformation, as a consequence of the equality

\min (B_{1}) = \min (B_{2}) = 0

. In this way, from inequalities

A \geq 1

and

B \geq 0

, it follows that both series of log-volumes are non-negative

(Y_{A, t}, Y_{B, t} \geq 0)

.
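The two transformations in Equation (47) and their inverses can be written compactly. The sketch below uses small made-up count arrays (they are not the crime data) and hypothetical function names; `shifted=True` corresponds to the ln(1 + B) form used for Series B, whose components can be zero.

```python
import numpy as np

def log_volume(counts, shifted):
    """ln(counts) for strictly positive counts, ln(1 + counts) if shifted is True."""
    counts = np.asarray(counts, dtype=float)
    return np.log1p(counts) if shifted else np.log(counts)

def inverse_log_volume(y, shifted):
    """Inverse maps: exp(y) for the plain form, exp(y) - 1 for the shifted form."""
    y = np.asarray(y, dtype=float)
    return np.expm1(y) if shifted else np.exp(y)

# Made-up daily counts, two components per series (illustrative values only).
counts_a = np.array([[120, 45], [98, 60], [150, 30]])   # all entries >= 1
counts_b = np.array([[0, 7], [3, 0], [12, 25]])         # zeros are allowed

y_a = log_volume(counts_a, shifted=False)
y_b = log_volume(counts_b, shifted=True)
print(y_a.min() >= 0.0, y_b.min() >= 0.0)
print(inverse_log_volume(y_b, shifted=True))
```

Using `log1p` and `expm1` for the shifted pair keeps the round trip accurate for small counts.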

Further, using the log-volumes as a basic bivariate series, the location parameter

μ_{0}

for both series is estimated, following the procedure described in Section 4.3. In more detail, the

μ

-estimates are obtained according to Equations (45) and (46), which correspond, respectively, to the sample and weighted mean values of the bivariate series

(Y_{A, t}), (Y_{B, t})

. Using Equations (11) and (12), the increment series

(X_{A, t})

and

(X_{B, t})

are then constructed. Based on these series, the remaining parameters collected in the vector

θ

are estimated, including the probability of exceeding the threshold and the elements of the covariance matrix. To this end, the procedures presented in Section 4.1 and Section 4.2 are applied, namely the method of moments (MoM) and empirical characteristic functions (ECF) method, thus ensuring consistency with the theoretical framework developed previously.

The resulting estimates reported in Table 6 demonstrate stability and interpretability, thereby enabling further analysis of different crime categories. In particular, the series-specific

μ

-estimates reflect systematic differences in the average growth rates of the corresponding crime categories. At the same time, the estimated

θ

-parameters suggest substantial variability and cross-dependence in the increment dynamics, justifying the use of a multivariate threshold-based model. Note that the magnitudes of the estimated parameters remain stable and interpretable across estimation methods, supporting the adequacy of the proposed inference framework. The modest differences between the MoM and ECF estimates remain within a comparable range and do not change the overall structural interpretation.

From an interpretative perspective, the estimated parameters provide additional insight into the structural dynamics of the analyzed crime categories. For Series A (petty theft and aggravated theft/robbery), the estimated probability

{\hat{p}}_{c} \approx 0.13

suggests that coordinated shock-activated episodes occur in roughly 13% of observations, indicating recurrent but not dominant structural fluctuations affecting both offense types simultaneously. For Series B (counterfeiting-related offenses and document falsification), the estimated probabilities range between 0.13 and 0.16, implying a comparable frequency of coordinated disturbances. However, the substantially higher estimated threshold

\hat{c}

(approximately 1.3–1.5, compared to 0.23–0.26 in Series A) indicates that more pronounced innovation magnitudes are required to trigger regime activation in financial and document-related crimes. This suggests that while synchronized shifts occur with similar frequency across both crime groups, the intensity of shocks necessary to activate such shifts differs, reflecting potentially distinct structural sensitivity patterns within the two categories.

To further assess the empirical adequacy of the proposed model, we compare the 2D-GSB specification with a standard first-order vector auto-regression (VAR(1)) benchmark estimated on the stationary increment series

(X_{A, t})

and

(X_{B, t})

. Table 7 reports the corresponding log-likelihood (LogLik), Akaike and Bayesian information criteria (AIC and BIC), together with the joint root mean square error (RMSE) for both models and both series. The joint RMSE is defined as the square root of the arithmetic mean of the component-wise mean squared deviations between empirical and fitted densities, thereby providing a single aggregate measure of distributional fit. As can be seen, the 2D-GSB model achieves higher log-likelihood values and lower information criteria and discrepancy measures than the standard VAR(1) specification. It is also worth noting that the proposed 2D-GSB framework involves only four parameters, compared to seven in the VAR model. These results indicate a more parsimonious yet substantially improved distributional fit of the increment process, particularly in the case of Series A.
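The comparison measures can be computed as in the following sketch. It uses the standard definitions AIC = 2k - 2·LogLik and BIC = k·ln(n) - 2·LogLik, and implements the joint RMSE as described in the text. All numbers in the demo are hypothetical placeholders, not the values of Table 7, and the function names are introduced here.

```python
import math
import numpy as np

def aic(loglik, k):
    """Akaike information criterion for a model with k parameters."""
    return 2.0 * k - 2.0 * loglik

def bic(loglik, k, n):
    """Bayesian information criterion for k parameters and n observations."""
    return k * math.log(n) - 2.0 * loglik

def joint_rmse(empirical, fitted):
    """Square root of the mean, over components, of the mean squared deviation
    between empirical and fitted density values on a common grid."""
    mse = [np.mean((np.asarray(e, dtype=float) - np.asarray(f, dtype=float)) ** 2)
           for e, f in zip(empirical, fitted)]
    return math.sqrt(sum(mse) / len(mse))

# Hypothetical log-likelihoods, for illustration only.
n_obs = 3652
print(aic(-1000.0, 4), bic(-1000.0, 4, n_obs))   # four-parameter model
print(aic(-1100.0, 7), bic(-1100.0, 7, n_obs))   # seven-parameter model

# Hypothetical density values for two components on a three-point grid.
emp = [[0.10, 0.30, 0.20], [0.05, 0.40, 0.15]]
fit = [[0.12, 0.28, 0.21], [0.06, 0.37, 0.16]]
print(joint_rmse(emp, fit))
```

Lower AIC, BIC and joint RMSE indicate the better fit, and the penalty terms show why a model with fewer parameters is favored when the log-likelihoods are close.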

In addition, Figure 9 presents the empirical marginal distributions of the bivariate increments together with the Gaussian fit implied by the VAR(1) specification and the Gaussian mixture of the increments implied by the 2D-GSB model introduced in Section 3. The VAR(1) estimation is carried out using the R-4.5.2 package “vars” [39], while the 2D-GSB parameters are obtained via the MoM procedure for Series A and the ECF method for Series B, as described previously. As illustrated in the figure, the mixture representation underlying the increments of the 2D-GSB model provides a closer alignment with the empirical distributions, particularly in capturing increased dispersion and heavier-tail behavior. In contrast, the single-Gaussian structure of the VAR(1) model tends to underestimate the probability of extreme observations in most cases. Overall, the visual agreement between the empirical histograms and the fitted mixture densities further supports the adequacy of the 2D-GSB framework for modeling synchronized regime dynamics in the observed data.

Further, fitting of the empirical distributions of the underlying Series A and B is carried out. Due to the distinct transformations in Equation (47), the implied distributions of these series differ. While Series A follows a mixture of bivariate log-normal distributions, Series B is characterized by a mixture of shifted bivariate log-normal distributions. In both cases, the Jacobian of the inverse transformation plays a crucial role, ensuring a proper mapping from the latent Gaussian mixture to the observable crime counts. Thus, the fitted distributions of the original crime series are obtained by an explicit change-of-variables procedure based on the transformations defined in Equation (47).

Let

Y_{t} \in \{Y_{A, t}, Y_{B, t}\}

denote either of the latent processes given by Equation (47). From the theoretical results in Section 3,

Y_{t}

follows a discrete mixture of bivariate Gaussian distributions

f_{Y_{t}} (y) = \sum_{k = 0}^{t} \binom{t}{k} p_{c}^{k} (1 - p_{c})^{t - k} ϕ (y; μ_{0}, (k + 1) Σ),

where

ϕ (y; μ_{0}, (k + 1) Σ)

is the PDF of the bivariate Gaussian distribution. Thus, for Series A, the inverse transformation

A_{t} = \exp (Y_{A, t})

yields the density

f_{A_{t}} (a) = f_{Y_{A, t}} (\ln a) |\det \frac{\partial \ln a}{\partial a}| = \sum_{k = 0}^{t} \binom{t}{k} p_{c}^{k} (1 - p_{c})^{t - k} \frac{ϕ (\ln a; μ_{A}, (k + 1) Σ_{A})}{a_{1} a_{2}},

where

a = {(a_{1}, a_{2})}^{⊤} > 0 .

Similarly, for Series B, using the inverse map

B_{t} = \exp (Y_{B, t}) - 1

, the resulting density is given by

f_{B_{t}} (b) = f_{Y_{B, t}} (\ln (b + 1)) |\det \frac{\partial \ln (b + 1)}{\partial b}| = \sum_{k = 0}^{t} \binom{t}{k} p_{c}^{k} (1 - p_{c})^{t - k} \frac{ϕ (\ln (b + 1); μ_{B}, (k + 1) Σ_{B})}{(1 + b_{1}) (1 + b_{2})},

where

b = {(b_{1}, b_{2})}^{⊤} \geq 0 .
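The densities above can be evaluated pointwise with a short routine. The following is a minimal sketch, not the authors' procedure (which relies on the R package “distr”): it sums the binomial-weighted Gaussian terms and applies the Jacobian factor of the log or shifted-log map. The function names and the demo parameter values are assumptions introduced here.

```python
import math
import numpy as np

def gauss2_pdf(y, mean, cov):
    """Density of the bivariate normal N(mean, cov) at the point y."""
    d = np.asarray(y, dtype=float) - np.asarray(mean, dtype=float)
    cov = np.asarray(cov, dtype=float)
    quad = float(d @ np.linalg.solve(cov, d))
    return math.exp(-0.5 * quad) / (2.0 * math.pi * math.sqrt(np.linalg.det(cov)))

def mixture_pdf(y, t, p_c, mu, sigma):
    """Binomial mixture of N(mu, (k + 1) * sigma), k = 0..t, evaluated at y."""
    sigma = np.asarray(sigma, dtype=float)
    return sum(math.comb(t, k) * p_c**k * (1.0 - p_c) ** (t - k)
               * gauss2_pdf(y, mu, (k + 1) * sigma)
               for k in range(t + 1))

def lognormal_mixture_pdf(x, t, p_c, mu, sigma, shift=0.0):
    """Density of the back-transformed series: shift=0 for exp(Y), shift=1 for exp(Y) - 1."""
    z = np.asarray(x, dtype=float) + shift           # argument of the logarithm
    return mixture_pdf(np.log(z), t, p_c, mu, sigma) / float(np.prod(z))  # Jacobian 1/(z1*z2)

# Demo values (illustrative, not estimates from the crime data).
mu_demo = np.array([0.0, 1.0])
sigma_demo = np.array([[1.0, 1.0], [1.0, 4.0]])
peak = 1.0 / (2.0 * math.pi * math.sqrt(3.0))        # N(mu, sigma) density at its mean

print(mixture_pdf(mu_demo, 0, 0.4, mu_demo, sigma_demo), peak)
print(lognormal_mixture_pdf([2.0, 5.0], 10, 0.4, mu_demo, sigma_demo))
print(lognormal_mixture_pdf([1.0, 4.0], 10, 0.4, mu_demo, sigma_demo, shift=1.0))
```

For large t the direct product of the binomial coefficient and the powers of p_c can overflow or underflow in floating point, so a practical implementation would compute the mixture weights on the log scale.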

Thus, the proposed framework induces a mixture of bivariate log-normal distributions for Series A and a mixture of shifted bivariate log-normal distributions for Series B, providing a link between the latent 2D-GSB dynamics and empirical distributions of observed crime counts. Nevertheless, it is worth noting that due to the non-stationarity of the mentioned series, which also depend on time

(t)

, it is necessary to apply some numerical procedures to calculate their PDFs. For this purpose, the R-4.5.2 package “distr” [40] is used, and the results of the applied procedure are shown in Figure 10.

As illustrated in Figure 10, the empirical distributions of the original crime counts are shown together with the fitted theoretical densities obtained via the inverse-log mixture representations implied by the 2D-GSB model. For Series A (theft-related offenses), the fitted log-normal mixtures capture both the central mass and the pronounced right tails of the distributions, indicating that the proposed model adequately reflects the observed variability and intermittency. Similarly, for Series B (counterfeiting-related offenses), the shifted log-normal mixtures provide a satisfactory approximation of the highly skewed empirical distributions, particularly in the lower-count region and the gradual tail decay. Overall, the agreement between empirical histograms and fitted densities confirms that the mixture structure derived from the latent 2D-GSB dynamics translates effectively to the level of observed criminal activity. In particular, extreme crime counts are naturally explained as realizations generated under the high-variance regime, without the need for additional ad hoc distributional assumptions.

6. Conclusions

This paper develops a two-dimensional GSB framework for modeling multivariate stochastic processes characterized by intermittent regime switches and pronounced cross-dependence. The proposed model extends the univariate GSB construction by introducing joint threshold-driven dynamics, which leads to a tractable mixture representation of the increment distribution. Theoretical results establish key distributional properties, including explicit forms of characteristic functions that characterize the distributions of the principal components of the 2D-GSB process and clearly distinguish between its stationary and non-stationary components. A unified estimation strategy combining moment-based and characteristic-function-based methods is introduced and shown to perform well in finite samples.

The empirical application to crime-related time series illustrates how both stationary and non-stationary features of the latent dynamics translate into realistic distributional characteristics of the observed data, including skewness, heavy tails, and joint variability. Overall, the results demonstrate that the proposed 2D-GSB framework provides a flexible and practically relevant representation of synchronized regime dynamics in correlated time series. In particular, the likelihood-based comparison with a VAR(1) benchmark indicates a substantially improved distributional fit, while preserving a parsimonious parameter structure. These findings, together with the theoretical results established in this paper, support the potential applicability of the 2D-GSB process in modeling multivariate time series characterized by regime-dependent behavior and heavy-tailed features.

Finally, the presented results open several directions for further research on related bivariate models. In addition to the Laplacian and Cauchy extensions already considered in the univariate setting, the proposed framework could be further generalized by allowing for elliptical or stable innovation distributions. Such extensions would naturally preserve the latent regime-switching structure, while enabling the modeling of heavier tails and more flexible dependence patterns.

Author Contributions

Conceptualization, S.S.; methodology, V.S.S. and M.J.; software, V.S.S. and M.J.; validation, S.S., V.S.S. and M.J.; formal analysis, S.S., V.S.S. and D.J.; investigation, R.R.; resources, R.R.; data curation, S.S. and V.S.S.; writing—original draft preparation, S.S., V.S.S. and M.J.; writing—review and editing, S.S., M.J. and D.J.; visualization, D.J. and R.R.; supervision, D.J. and R.R.; project administration, S.S. and M.J. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The data presented in this study are official data from the Ministry of Internal Affairs of the Republic of Serbia and are available upon request from the corresponding author.

Acknowledgments

The authors sincerely thank the Ministry of Internal Affairs of the Republic of Serbia, who officially provided the dataset presented in this study.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Engle, R.F.; Smith, A.D. Stochastic Permanent Breaks. Rev. Econ. Stat. 1999, 81, 553–574.
2. Huang, B.-N.; Fok, R.C.W. Stock Market Integration—An Application of the Stochastic Permanent Breaks Model. Appl. Econ. Lett. 2001, 8, 725–729.
3. Gonzalo, J.; Martínez, O. Large Shocks vs. Small Shocks. (Or does size matter? May be so.). J. Econom. 2006, 135, 311–347.
4. Bisaglia, L.; Gerolimetto, M. Forecasting long memory time series when occasional breaks occur. Econ. Lett. 2008, 98, 253–258.
5. Bisaglia, L.; Gerolimetto, M. An empirical strategy to detect spurious effects in long memory and occasional-break processes. Commun. Stat. Simul. Comput. 2008, 38, 172–189.
6. Kapetanios, G.; Tzavalis, E. Modeling Structural Breaks in Economic Relationships Using Large Shocks. J. Econom. Dynam. Control 2010, 34, 417–436.
7. Dendramis, Y.; Kapetanios, G.; Tzavalis, E. Level Shifts in Stock Returns Driven by Large Shocks. J. Empir. Financ. 2014, 29, 41–51.
8. Dendramis, Y.; Kapetanios, G.; Tzavalis, E. Shifts in Volatility Driven by Large Stock Market Shocks. J. Econom. Dynam. Control 2015, 55, 130–147.
9. Granero-Belinchón, C.; Roux, S.G.; Garnier, N.B. Information Theory for Non-Stationary Processes with Stationary Increments. Entropy 2019, 21, 1223.
10. Rebei, N.; Sbia, R. Transitory and Permanent Shocks in the Global Market for Crude Oil. J. Appl. Econom. 2021, 36, 1047–1064.
11. Stojanović, V.; Popović, B.Č.; Popović, P. Model of General Split-BREAK Process. REVSTAT Stat. J. 2015, 13, 145–168.
12. Jovanović, M.; Stojanović, V.; Kuk, K.; Popović, B.; Čisar, P. Asymptotic Properties and Application of GSB Process: A Case Study of the COVID-19 Dynamics in Serbia. Mathematics 2022, 10, 3849.
13. Stojanović, V.; Milovanović, G.V.; Jelić, G. Distributional Properties and Parameters Estimation of GSB Process: An Approach Based on Characteristic Functions. ALEA Lat. Am. J. Probab. Math. Stat. 2016, 13, 835–861.
14. Stojanović, V.S.; Bakouch, H.S.; Ljajko, E.; Božović, I. Laplacian Split-BREAK Process with Application in Dynamic Analysis of the World Oil and Gas Market. Axioms 2023, 12, 622.
15. Ljajko, E.; Stojanović, V.S.; Tošić, M.; Božović, I. Cauchy Split-BREAK Process: Asymptotic Properties and Application in Securities Market Analysis. UPB Sci. Bull. Ser. A Appl. Math. Phys. 2023, 85, 139–154.
16. Kole, E.; van Dijk, D. Moments, Shocks and Spillovers in Markov-switching VAR Models. J. Econom. 2023, 236, 105474.
17. Tan, Z.; Wu, Y. On Regime Switching Models. Mathematics 2025, 13, 1128.
18. El Ghaoui, L.; Tsai, A.Y.; Calafiore, G.C. Linear Algebra and Applications; VinUniversity Pressbooks: Hanoi, Vietnam, 2023.
19. Williams, D. Probability with Martingales; Cambridge University Press: Cambridge, UK, 1991.
20. Billingsley, P. Convergence of Probability Measures; John Wiley & Sons, Inc.: Hoboken, NJ, USA, 1999.
21. Gao, X.; Sitharam, M.; Roitberg, A. Bounds on the Jensen Gap, and Implications for Mean-Concentrated Distributions. Aust. J. Math. Anal. Appl. 2019, 16, 1–16.
22. Sen, P.K.; Singer, J.M. Large Sample Methods in Statistics: An Introduction with Applications (Reprint); Chapman & Hall/CRC: Boca Raton, FL, USA, 2000.
23. Hoeffding, W.; Robbins, H. The Central Limit Theorem for Dependent Random Variables. Duke Math. J. 1948, 15, 773–780.
24. Cuesta-Albertos, J.A.; Fraiman, R.; Ransford, T. A Sharp Form of the Cramér–Wold Theorem. J. Theor. Probab. 2007, 20, 201–209.
25. Serfling, R.J. Approximation Theorems of Mathematical Statistics, 2nd ed.; John Wiley & Sons: Hoboken, NJ, USA, 2002.
26. Knight, J.L.; Yu, J. Empirical Characteristic Function in Time Series Estimation. Econom. Theory 2002, 18, 691–721.
27. Yu, J. Empirical Characteristic Function Estimation and Its Applications. Econom. Rev. 2004, 23, 93–123.
28. Kotchoni, R. Applications of the Characteristic Function-Based Continuum GMM in Finance. Comput. Stat. Data Anal. 2012, 56, 3599–3622.
29. Meintanis, S.G.; Swanepoel, J.; Allison, J. The Probability Weighted Characteristic Function and Goodness-of-Fit Testing. J. Stat. Plan. Infer. 2014, 146, 122–132.
30. Meintanis, S.G. A Review of Testing Procedures Based on the Empirical Characteristic Function. S. Afr. Statist. J. 2016, 50, 1–14.
31. Gu, W.; Zhang, L. Strong Law of Large Numbers for m-dependent and Stationary Random Variables under Sub-linear Expectations. Sci. Sinica Math. 2026, 56, 73.
32. Newey, W.K.; McFadden, D. Large Sample Estimation and Hypothesis Testing. Handb. Econom. 1994, 4, 2111–2245.
33. Gross, J.; Ligges, U. Nortest: Tests for Normality. R Package Version 1.0-4, 2015. Available online: http://CRAN.R-project.org/package=nortest (accessed on 3 January 2026).
34. Milovanović, G.V. Construction and Applications of Gaussian Quadratures with Nonclassical and Exotic Weight Functions. Stud. Univ. Babes-Bolyai Math. 2015, 60, 211–233.
35. Giner, G.; Smyth, G.K. Statmod: Probability Calculations for the Inverse Gaussian Distribution. arXiv 2016, arXiv:1603.06687.
36. Byrd, R.H.; Lu, P.; Nocedal, J.; Zhu, Z. A Limited Memory Algorithm for Bound Constrained Optimization. SIAM J. Sci. Comput. 1995, 16, 1190–1208.
37. So, M.K.; Chen, C.W.; Chiang, T.C.; Lin, D.S. Modelling Financial Time Series with Threshold Nonlinearity in Returns and Trading Volume. Appl. Stoch. Models Bus. Ind. 2007, 23, 319–338.
38. Enow, S.T. Modelling Financial Time Series with Threshold Nonlinearity. Int. J. Res. Bus. Soc. Sci. 2025, 14, 152–156.
39. Pfaff, B. VAR, SVAR and SVEC Models: Implementation Within R Package vars. J. Stat. Soft. 2008, 27, 1–32. Available online: https://www.jstatsoft.org/v27/i04/ (accessed on 6 February 2026).
40. Ruckdeschel, P.; Kohl, M.; Stabla, T.; Camphausen, F. S4 Classes for Distributions. R News 2006, 6, 2–6. Available online: https://CRAN.R-project.org/doc/Rnews (accessed on 11 January 2026).

Figure 1. Upper panels: dynamics of the non-stationary components of the 2D-GSB process. Other panels: autocorrelation functions of the non-stationary components. (Parameter values: $μ = 0$, $σ_{1}^{2} = σ_{2}^{2} = 1$, $ρ = 0.6$, and $c \approx 3.8$.)

Figure 2. Comparison of the PDFs (a) and CF moduli (b) of the weighted and standard $χ_{2}^{2}$ distributions. (Parameter values: $σ_{1} = 1$, $σ_{2} = 2$, and $ρ = 0.8$, which imply $λ_{1} \approx 0.307$, $λ_{2} \approx 4.693$.)
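The values $λ_{1}$ and $λ_{2}$ quoted in the caption of Figure 2 coincide numerically with the eigenvalues of the covariance matrix built from $σ_{1} = 1$, $σ_{2} = 2$, and $ρ = 0.8$. The following Python check is a minimal sketch under the assumption that the weights are defined in this way; the function name `cov_eigenvalues` is illustrative and does not come from the article.

```python
import math

def cov_eigenvalues(sigma1, sigma2, rho):
    """Closed-form eigenvalues of the 2x2 covariance matrix
    [[sigma1^2, rho*sigma1*sigma2], [rho*sigma1*sigma2, sigma2^2]]."""
    a, d = sigma1 ** 2, sigma2 ** 2
    b = rho * sigma1 * sigma2
    half_trace = (a + d) / 2.0
    radius = math.sqrt(((a - d) / 2.0) ** 2 + b ** 2)
    return half_trace - radius, half_trace + radius

# Parameter values taken from the caption of Figure 2.
lam1, lam2 = cov_eigenvalues(1.0, 2.0, 0.8)
print(round(lam1, 3), round(lam2, 3))  # 0.307 4.693
```

The two eigenvalues sum to the trace $σ_{1}^{2} + σ_{2}^{2} = 5$ and multiply to the determinant $σ_{1}^{2} σ_{2}^{2} (1 - ρ^{2}) = 1.44$, which gives a quick sanity check on the quoted values.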

Figure 3. Convergence of the moduli of the CFs for the bivariate RVs $t^{-1/2} μ_{t}$ (a) and $t^{-1/2} Y_{t}$ (b), for $t = 1, 2, \dots, 50$. (Parameter values: $p_{c} = 0.4$, $σ_{1} = σ_{2} = 1$, and $ρ = 0.6$.)

Figure 4. Moduli of the CFs of the bivariate RVs $M_{t; α}$ (a) and $U_{t; α}$ (b), for different values of $α$ and $t = 50$. (Parameter values are the same as in Figure 3.)

Figure 5. 3D plots of the asymptotic variances of the estimates ${\tilde{μ}}_{n}$ (a) and ${\hat{μ}}_{n}$ (b), as functions of the parameter $p_{c} \in (0,1)$ and the sample size $n > 0$.

Figure 6. Q–Q plots of the empirical distributions of the MoM estimates ${\hat{p}}_{c}$, ${\hat{σ}}_{1}^{2}$, ${\hat{σ}}_{2}^{2}$, and $\hat{ρ}$, obtained from $S = 300$ Monte Carlo replications of the 2D-GSB process with sample size $n = 1500$.

Figure 7. Q–Q plots of the empirical distributions of the ECF estimates ${\hat{p}}_{c}$, ${\hat{σ}}_{1}^{2}$, ${\hat{σ}}_{2}^{2}$, and $\hat{ρ}$, based on $S = 300$ Monte Carlo replications with sample size $n = 1500$.

Figure 8. Dynamics of the total number of two types of theft (a) and forgery (b) on the territory of the Republic of Serbia.

Figure 9. Empirical distributions of the stationary series of increments (histograms) along with their fitted PDFs obtained using VAR(1) and 2D-GSB model (lines): Series A—plots above; Series B—plots below.

Figure 10. Empirical distributions of crime dynamics data (histograms) and their fitted PDFs (lines), obtained by the proposed estimation procedure: Series A—plots above; Series B—plots below.

Table 1. Descriptive statistics and AN testing results of the mean value estimates. (Series length is $n = 150$, and the true parameter values are $μ_{0}^{(1)} = 0$, $μ_{0}^{(2)} = 1$.)

Statistics	Sample Mean Estimator (${\tilde{μ}}_{n}$)		Weighted Estimator (${\hat{μ}}_{n}$)
Statistics	${\tilde{μ}}_{n}^{(1)}$	${\tilde{μ}}_{n}^{(2)}$	${\hat{μ}}_{n}^{(1)}$	${\hat{μ}}_{n}^{(2)}$
Min	−12.160	−22.467	−7.9302	−14.305
Mean	−0.0756	0.7713	−0.0569	0.9442
Max	16.408	27.833	10.318	16.982
Bias	−0.0756	−0.2287	−0.0569	−0.0558
StDev	4.3383	8.9660	2.7328	5.4760
RMSE	4.3317	8.9539	2.7289	5.4672
SW	0.9967	0.9970	0.9940	0.9971
($p$-value)	(0.8003)	(0.8571)	(0.2827)	(0.8695)
AD	0.1182	0.2874	0.3676	0.3367
($p$-value)	(0.9899)	(0.6182)	(0.4289)	(0.5042)
JB	1.2528	0.3156	2.8775	0.1155
($p$-value)	(0.5345)	(0.8540)	(0.2372)	(0.9439)
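
The Bias, StDev, and RMSE rows of Table 1 are linked by the usual decomposition of the mean squared error into squared bias plus variance. The sketch below recomputes the RMSE from the other two rows. It assumes $S = 300$ Monte Carlo replications (the value stated for Figures 6 and 7, not for Table 1 itself) and that StDev is the sample standard deviation with divisor $S - 1$; the dictionary keys and the helper `rmse_from` are illustrative names.

```python
import math

S = 300  # ASSUMPTION: number of replications, not stated for Table 1

# (Bias, StDev, reported RMSE) for each column of Table 1
columns = {
    "mu_tilde_1": (-0.0756, 4.3383, 4.3317),
    "mu_tilde_2": (-0.2287, 8.9660, 8.9539),
    "mu_hat_1": (-0.0569, 2.7328, 2.7289),
    "mu_hat_2": (-0.0558, 5.4760, 5.4672),
}

def rmse_from(bias, stdev, s):
    """RMSE from bias and sample StDev; rescales the 1/(s-1) variance to 1/s."""
    return math.sqrt(bias ** 2 + (s - 1) / s * stdev ** 2)

recomputed = {name: rmse_from(b, sd, S) for name, (b, sd, _) in columns.items()}
for name, (_, _, reported) in columns.items():
    print(name, round(recomputed[name], 4), reported)
```

Under these assumptions the recomputed values agree with the reported RMSE row up to rounding of the tabulated inputs. The same rows also show the weighted estimator ${\hat{μ}}_{n}$ having a smaller RMSE than the sample mean ${\tilde{μ}}_{n}$ in both components.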

Table 2. Descriptive statistics and AN testing results of estimated parameter values. (Series length is $n = 150$, and the true parameter values are $p_{c} \approx 0.3967$, $σ_{1}^{2} = 1$, $σ_{2}^{2} = 4$, $ρ = 0.5$.)

Statistics	MoM				ECF
Statistics	$p_{c}$	$σ_{1}^{2}$	$σ_{2}^{2}$	$ρ$	$p_{c}$	$σ_{1}^{2}$	$σ_{2}^{2}$	$ρ$
Min	0.0000	0.6768	2.7087	0.2738	0.0944	0.5628	2.8580	0.2820
Mean	0.3846	0.9845	4.0192	0.4973	0.3872	0.9987	4.0186	0.4982
Max	0.7135	1.3562	6.3611	0.6745	0.8229	1.4708	5.6043	0.7675
Bias	−0.0121	−0.0155	0.0192	−0.0027	−0.0095	−0.0013	0.0186	−0.0018
StDev	0.1407	0.1381	0.6353	0.0785	0.1579	0.1618	0.5839	0.0862
RMSE	0.1407	0.1384	0.6334	0.0784	0.1580	0.1613	0.5823	0.0860
SW	0.986 **	0.995	0.985 **	0.994	0.980 *	0.996	0.984 **	0.992
($p$-value)	(0.005)	(0.376)	(0.004)	(0.329)	(0.026)	(0.735)	(0.002)	(0.083)
AD	0.850 *	0.446	0.753 *	0.261	0.933 *	0.333	0.883 *	0.602
($p$-value)	(0.028)	(0.281)	(0.049)	(0.706)	(0.018)	(0.509)	(0.024)	(0.117)
JB	6.502 *	2.066	8.054 *	2.502	4.272	0.556	12.676 **	4.816
($p$-value)	(0.039)	(0.356)	(0.018)	(0.286)	(0.118)	(0.757)	(0.002)	(0.090)

* $p < 0.05$, ** $p < 0.01$.

Table 3. Descriptive statistics and AN testing results of estimated parameter values. (Series length is $n = 500$, and the true parameter values are the same as in Table 2.)

Statistics	MoM				ECF
Statistics	$p_{c}$	$σ_{1}^{2}$	$σ_{2}^{2}$	$ρ$	$p_{c}$	$σ_{1}^{2}$	$σ_{2}^{2}$	$ρ$
Min	0.0794	0.7787	3.2927	0.3459	0.1409	0.8423	3.3609	0.3657
Mean	0.3984	1.0102	3.9947	0.4959	0.3953	1.0056	3.9953	0.4988
Max	0.6012	1.2533	4.8995	0.6066	0.5941	1.2592	4.8769	0.6025
Bias	0.0017	0.0102	−0.0053	−0.0041	−0.0014	0.0056	−0.0047	−0.0012
StDev	0.0722	0.1018	0.3350	0.0464	0.0884	0.0876	0.2995	0.0438
RMSE	0.0720	0.1019	0.3339	0.0466	0.0884	0.0875	0.2986	0.0437
SW	0.978 *	0.991	0.984	0.991	0.991	0.997	0.988	0.992
($p$-value)	(0.016)	(0.478)	(0.074)	(0.497)	(0.505)	(0.983)	(0.214)	(0.536)
AD	0.699	0.456	0.468	0.462	0.526	0.186	0.375	0.452
($p$-value)	(0.067)	(0.263)	(0.247)	(0.254)	(0.178)	(0.904)	(0.410)	(0.270)
JB	5.984	1.186	2.454	1.632	0.433	0.313	2.271	2.047
($p$-value)	(0.050)	(0.553)	(0.293)	(0.442)	(0.805)	(0.855)	(0.321)	(0.359)

* $p < 0.05$.

Table 4. Descriptive statistics and AN testing results of estimated parameter values. (Series length is $n = 1500$, and the true parameter values are the same as in Table 2 and Table 3.)

Statistics	MoM				ECF
Statistics	$p_{c}$	$σ_{1}^{2}$	$σ_{2}^{2}$	$ρ$	$p_{c}$	$σ_{1}^{2}$	$σ_{2}^{2}$	$ρ$
Min	0.2588	0.8597	3.5807	0.4272	0.2806	0.8698	3.6226	0.4376
Mean	0.3984	1.0051	4.0355	0.5010	0.3969	1.0047	4.0264	0.5006
Max	0.4968	1.1618	4.5975	0.5644	0.5454	1.1595	4.5100	0.5575
Bias	0.0017	0.0051	0.0355	0.0010	0.0002	0.0047	0.0264	0.0006
StDev	0.0471	0.0529	0.1983	0.0240	0.0418	0.0504	0.1775	0.0235
RMSE	0.0469	0.0529	0.2008	0.0239	0.0418	0.0504	0.1789	0.0234
SW	0.997	0.992	0.994	0.997	0.994	0.995	0.994	0.997
($p$-value)	(0.784)	(0.111)	(0.326)	(0.854)	(0.224)	(0.359)	(0.335)	(0.755)
AD	0.315	0.627	0.415	0.215	0.722	0.445	0.404	0.329
($p$-value)	(0.542)	(0.101)	(0.333)	(0.846)	(0.059)	(0.282)	(0.354)	(0.514)
JB	0.476	5.018	2.232	0.469	0.549	2.434	2.823	0.300
($p$-value)	(0.788)	(0.081)	(0.328)	(0.791)	(0.760)	(0.296)	(0.244)	(0.861)
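
If the MoM and ECF estimators are $\sqrt{n}$-consistent, as the asymptotic normality tests in Tables 2–4 suggest, then the product $\sqrt{n} \cdot \text{StDev}$ should stay roughly constant across the three sample sizes. The following Python sketch tabulates that product from the StDev rows of Tables 2–4; it is an informal diagnostic added here for illustration, not a procedure taken from the article, and the names `stdev`, `params`, and `scaled_stdev` are hypothetical.

```python
import math

# StDev rows of Tables 2-4, ordered as (p_c, sigma1^2, sigma2^2, rho)
stdev = {
    "MoM": {150: (0.1407, 0.1381, 0.6353, 0.0785),
            500: (0.0722, 0.1018, 0.3350, 0.0464),
            1500: (0.0471, 0.0529, 0.1983, 0.0240)},
    "ECF": {150: (0.1579, 0.1618, 0.5839, 0.0862),
            500: (0.0884, 0.0876, 0.2995, 0.0438),
            1500: (0.0418, 0.0504, 0.1775, 0.0235)},
}
params = ("p_c", "sigma1^2", "sigma2^2", "rho")

def scaled_stdev(method):
    """Return {param: [sqrt(n) * StDev for n = 150, 500, 1500]}."""
    table = stdev[method]
    return {p: [math.sqrt(n) * table[n][i] for n in sorted(table)]
            for i, p in enumerate(params)}

scaled = {method: scaled_stdev(method) for method in stdev}
for method, by_param in scaled.items():
    for p, values in by_param.items():
        print(method, p, [round(v, 3) for v in values])
```

For every parameter and both methods, the largest and smallest scaled values differ by well under a factor of 1.5, which is compatible with a $\sqrt{n}$ rate given that each StDev is itself estimated from a finite number of replications.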

Table 5. Basic statistical indicators of observed real-world data.

Statistics	A₁	A₂	B₁	B₂
Min	3	4	0	0
Mean	47.13	35.14	1.677	7.306
Median	46	31	1	6
Mode	44	23	0	6
Max	152	162	87	304
Variance	205.1	289.7	8.049	88.79
Kurtosis	5.634	3.109	268.0	399.8
Skewness	1.366	1.319	11.37	16.14
Range	149	158	87	304
Correlation	0.6566		0.3391

Table 6. Estimated parameter values of the 2D-GSB process.

$μ$ estimates	$Y_{A, t}^{(1)}$	$Y_{A, t}^{(2)}$	$Y_{B, t}^{(1)}$	$Y_{B, t}^{(2)}$
${\tilde{μ}}_{n}$	3.808	3.450	0.7368	1.903
${\hat{μ}}_{n}$	3.946	3.789	0.6741	1.949
$θ$ estimates	MoM (Series A)	ECF (Series A)	MoM (Series B)	ECF (Series B)
${\hat{p}}_{c}$	0.1290	0.1296	0.1345	0.1626
$\hat{c}$	0.2574	0.2320	1.467	1.322
${\hat{σ}}_{1}^{2}$	0.0566	0.0589	0.3875	0.3445
${\hat{σ}}_{2}^{2}$	0.0680	0.0857	0.8426	0.8357
$\hat{ρ}$	0.3698	0.3618	0.1478	0.1707

Table 7. Goodness-of-fit statistics of VAR(1) and 2D-GSB benchmarks.

Statistics	Series A		Series B
Statistics	VAR(1)	2D-GSB	VAR(1)	2D-GSB
LogLik	−1339.1	−1154.9	−8031.5	−7863.4
AIC	2692.2	2317.8	16,077.1	15,734.7
BIC	2735.6	2342.6	16,120.5	15,759.5
RMSE	0.0105	$7.56 \times 10^{-3}$	0.0237	0.0209
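
The LogLik, AIC, and BIC rows of Table 7 can be cross-checked with the standard definitions $\text{AIC} = 2k - 2\log L$ and $\text{BIC} = k \ln n - 2\log L$, where $k$ is the number of fitted parameters and $n$ the number of observations. Neither $k$ nor $n$ is listed in the table, so the sketch below recovers the values they imply; the helper `implied_k_and_n` is an illustrative name and the recovered numbers are inferences, not figures reported by the authors.

```python
import math

# (LogLik, AIC, BIC) as reported in Table 7
fits = {
    ("A", "VAR(1)"): (-1339.1, 2692.2, 2735.6),
    ("A", "2D-GSB"): (-1154.9, 2317.8, 2342.6),
    ("B", "VAR(1)"): (-8031.5, 16077.1, 16120.5),
    ("B", "2D-GSB"): (-7863.4, 15734.7, 15759.5),
}

def implied_k_and_n(loglik, aic, bic):
    """Invert AIC = 2k - 2logL and BIC = k ln(n) - 2logL for k and n."""
    k = round((aic + 2.0 * loglik) / 2.0)   # parameter count is an integer
    log_n = (bic + 2.0 * loglik) / k
    return k, math.exp(log_n)

implied = {key: implied_k_and_n(*vals) for key, vals in fits.items()}
for (series, model), (k, n) in implied.items():
    print(series, model, "k =", k, "implied n ~", round(n))
```

The reported values are consistent with $k = 7$ for the VAR(1) benchmark and $k = 4$ for the 2D-GSB model, with an implied sample size of a few thousand observations in every case (the exact figure is sensitive to the one-decimal rounding of the table). Since the 2D-GSB fit has both the higher log-likelihood and the smaller $k$, it has the lower AIC and BIC in both series.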

Share and Cite

MDPI and ACS Style

Stojičić, S.; Stojanović, V.S.; Jovanović, M.; Joksimović, D.; Radovanović, R. Bivariate Generalized Split-BREAK Process with Application in Modeling Crime Dynamics. Mathematics 2026, 14, 754. https://doi.org/10.3390/math14050754
