Article

Minimax Lower Bounds for Uniform Estimation of Covariate-Dependent Copula Parameters

by Mathias Nthiani Muia 1,*, Olivia Atutey 1 and Chathurika Srimali Abeykoon 2

1 Department of Mathematics and Statistics, University of South Alabama, Mobile, AL 36688, USA
2 Department of Mathematics and Statistics, Rhodes College, Memphis, TN 38112, USA
* Author to whom correspondence should be addressed.
Mathematics 2026, 14(5), 914; https://doi.org/10.3390/math14050914
Submission received: 9 February 2026 / Revised: 4 March 2026 / Accepted: 6 March 2026 / Published: 8 March 2026
(This article belongs to the Special Issue Advances in Probability Theory and Stochastic Analysis)

Abstract

Local likelihood methods are widely used to estimate calibration functions in conditional copula models. Recent work has established uniform stochastic equicontinuity and uniform convergence rates for local likelihood estimators of covariate-dependent copula parameters, yielding global consistency guarantees and supporting the stability of local optimization routines. This paper complements those results by deriving minimax lower bounds for uniform estimation over Hölder classes of calibration functions. Under mild regularity conditions on the copula family and the covariate design, we show that the minimax sup-norm risk over a compact covariate region is bounded below by the classical nonparametric rate for smooth functions on an s-dimensional domain. The proof combines a localized packing construction with a Fano–Le Cam testing argument, using second-order expansions of the conditional copula likelihood to control information distances. As a consequence, local polynomial likelihood estimators achieve the minimax rate up to the logarithmic factors inherent to uniform estimation, providing a sharp optimality justification for their use in conditional copula modeling.

1. Introduction

Copula models provide a flexible framework for modeling multivariate dependence by separating marginal distributions from the dependence structure, as guaranteed by Sklar’s theorem [1,2]. In many applications, however, the strength and form of dependence vary with observable covariates such as time, spatial location, or environmental conditions. Conditional copula models accommodate this heterogeneity by allowing the copula parameter to vary smoothly with covariates [3,4,5].
Early work in this area includes the semiparametric conditional copula model of [6], who proposed local pseudo-likelihood estimation with local polynomial approximation and established consistency, asymptotic normality, and bandwidth selection procedures. A related local likelihood framework for parametric conditional copulas was developed by [3], who derived pointwise bias and variance expressions, introduced cross-validated copula selection, and constructed confidence intervals for covariate-dependent dependence parameters. Fully nonparametric approaches were studied by [7], who analyzed the asymptotic properties of conditional copula estimators and associated dependence measures.
Let $Y \in \mathcal{Y} \subset \mathbb{R}^s$ denote a vector of covariates and $U = (U_1, \dots, U_d) \in (0,1)^d$ pseudo-observations obtained from continuous conditional marginals. A conditional copula model assumes that
$$U \mid Y = y \;\sim\; C(\cdot \mid \theta(y)), \qquad y \in \mathcal{Y}, \qquad (1)$$
where $\theta(\cdot)$ is an unknown parameter function taking values in a parameter space $\Theta$.
The central inferential problem is the uniform estimation of the covariate-dependent copula parameter $\theta(\cdot)$. Local likelihood methods are particularly well suited for this task: rather than modeling $\theta(\cdot)$ directly, one locally approximates a suitably transformed calibration function by a polynomial obtained from a Taylor expansion around each evaluation point and maximizes a kernel-weighted copula log-likelihood [3,8]. This approach, introduced by [3], has been further developed in [4]. Throughout this literature, including the present paper, the marginal distributions are assumed to be known, so that the pseudo-observations $U$ are treated as directly observed.
The assumption of known margins is standard in theoretical analyses of conditional copula models and is appropriate in several practically relevant settings, including cases where marginal models are specified a priori, where margins can be estimated at a faster parametric rate and treated as known in a second stage, or where inference focuses primarily on covariate-dependent dependence. Under these conditions, the copula likelihood provides a valid basis for inference on $\theta(\cdot)$, and marginal estimation error can be neglected at the level of first-order asymptotics. In empirical applications where marginal distributions depend on covariates, pseudo-observations that are approximately $\mathrm{Unif}(0,1)$ conditional on $Y$ can be constructed by first removing the systematic effect of $Y$ from each margin and then applying an empirical CDF (rank) transformation to the adjusted observations, following [3,9,10]. This two-step adjustment ensures compatibility with the copula modeling framework while preserving the focus on covariate-dependent dependence.
From a decision-theoretic perspective, the model with unknown margins strictly contains the known-margin experiment studied in this paper as a submodel. Consequently, the corresponding minimax risk in the larger experiment cannot be smaller than that in the restricted setting analyzed here. The lower bound established in the main result of this work (Theorem 1), therefore, remains valid under unknown margins, since any estimator in the larger experiment must, in particular, operate over the known-margin submodel.
The minimax lower bound proved below is information-theoretic and does not depend on the construction of a particular estimator. The additional nuisance estimation of the marginal distributions may affect constants or higher-order terms in the uniform rate, but it cannot invalidate the lower bound itself. When the margins are estimated at a $\sqrt{N}$-consistent rate, standard two-step semiparametric arguments (see, e.g., [11]) suggest that their impact should be negligible relative to the slower nonparametric rate $(\log N / N)^{\beta/(2\beta+s)}$. A fully uniform-in-$y$ treatment of the joint margin–copula experiment is beyond the scope of the present work.
Related work has addressed the testing and estimation of covariate effects in conditional copula models. For example, [12] proposed fully nonparametric tests of the simplifying assumption that covariates affect dependence only through the margins, while [10] developed score-based tests for parametric specifications of covariate effects. Earlier work by [13] established consistency and weak convergence of copula estimators under the simplifying assumption, and extensions to multivariate and functional covariates were studied in [14].
From a theoretical perspective, early analyses of local likelihood methods focused on pointwise bias, variance, and asymptotic normality at fixed covariate values [3]. However, uniform guarantees are essential in practice, as they underpin numerical stability, bandwidth selection based on global criteria, and simultaneous inference over covariate regions.
Recent work has developed a uniform asymptotic theory for kernel-weighted local likelihood estimation of covariate-dependent copula parameters, establishing uniform convergence of the local log-likelihood and its derivatives and deducing uniform consistency of the estimated parameter curve and dependence measures, such as Kendall's $\tau$ [4]. The corresponding uniform upper bound for the estimation error $\sup_{y \in \mathcal{Y}_0} |\hat{\theta}(y) - \theta(y)|$, together with the required regularity conditions, is established in the companion paper [4]. Under standard smoothness and design conditions, the local likelihood estimator $\hat{\theta}$ satisfies [4]
$$\sup_{y \in \mathcal{Y}_0} \big| \hat{\theta}(y) - \theta(y) \big| = O_p\!\left( h^{p+1} + \sqrt{\frac{\log(1/h)}{N h^s}} \right), \qquad (2)$$
uniformly over compact $\mathcal{Y}_0 \subset \mathrm{int}(\mathcal{Y})$, where the logarithmic factor arises from entropy bounds for kernel-indexed function classes [4,15].
While such uniform upper bounds provide strong guarantees, they do not establish optimality. Minimax theory characterizes the intrinsic difficulty of an estimation problem by identifying the best achievable worst-case risk over a function class. Establishing minimax lower bounds is, therefore, essential to determine whether existing procedures attain optimal rates and to identify unavoidable sources of estimation error.
The goal of this paper is to derive minimax lower bounds for the uniform estimation of the calibration function in the conditional copula model (1), under the known-margin framework described above. Working over Hölder classes of order $p + 1$, we establish a lower bound for the minimax sup-norm risk over $\mathcal{Y}_0$. The resulting rate coincides with the classical minimax rate for the sup-norm estimation of a smooth regression function, showing that the nonlinear copula likelihood does not alter the fundamental global difficulty of the problem.
The proof is based on a localized packing construction combined with an information-theoretic testing argument of Fano–Le Cam type. The control of the Kullback–Leibler divergences between the induced joint distributions is achieved via second-order expansions of the conditional copula likelihood under mild curvature conditions. The argument is self-contained and relies on standard tools from empirical process theory and minimax analysis [15,16,17]. To the best of our knowledge, this is the first minimax lower bound for uniform estimation in conditional copula models.
The remainder of the paper is organized as follows. Section 2 introduces the conditional copula model and reviews the local likelihood framework that motivates our minimax analysis. Section 3 defines the smoothness class, loss function, and standing regularity assumptions. Section 4 presents the main result: a minimax lower bound for uniform estimation over compact covariate regions. Section 5 reports a simulation study illustrating the finite-sample performance of local polynomial likelihood estimators under Clayton and Gumbel copulas and compares empirical uniform errors with the minimax benchmark rate. Section 6 concludes with a discussion of the main implications and possible extensions. Appendix A contains the complete proof of the minimax lower bound, including all auxiliary lemmas and the Fano–Le Cam testing argument with Kullback–Leibler control derived from likelihood curvature. The subsequent Appendix B verifies Assumption (A3) by establishing quadratic mean differentiability and uniform Fisher information bounds for the bivariate Clayton and Gumbel copula families used in the simulation study.

2. Conditional Copula Model and Local Likelihood Estimation

This section introduces the conditional copula model and a standard local likelihood estimator for the calibration function. The estimator provides the methodological motivation for the minimax lower bound derived later. Importantly, the minimax result in Section 4 is information-theoretic and applies to any estimator of $\theta(\cdot)$; it does not rely on the local likelihood form. The role of the present section is, therefore, to (i) fix notation for the statistical experiment and the estimation target, and (ii) connect the benchmark minimax rate to a concrete procedure whose uniform performance is studied in [4].

2.1. Conditional Copula Model

Let $\{(U_i, Y_i)\}_{i=1}^N$ be i.i.d. observations, where $Y_i \in \mathbb{R}^s$ is a vector of covariates, and $U_i = (U_{i1}, \dots, U_{id}) \in (0,1)^d$ are pseudo-observations obtained from continuous conditional marginal distributions. Throughout, the marginals are assumed to be known, so the $U_i$ are treated as directly observed.
Fix a parametric copula family $\{C(\cdot \mid \theta) : \theta \in \Theta\}$ with density $c(\cdot \mid \theta)$. The conditional copula model assumes that, given $Y_i = y$,
$$U_i \mid Y_i = y \;\sim\; C(\cdot \mid \theta(y)), \qquad (3)$$
where $\theta(\cdot) : \mathcal{Y} \to \Theta$ is an unknown calibration function governing how dependence varies with the covariate. For simplicity, we treat $\theta(y)$ as scalar. For $\theta(y) \in \mathbb{R}^k$ with fixed $k$, the lower-bound construction extends by packing in $\mathbb{R}^k$ using coordinate-wise bump perturbations and measuring risk under an appropriate norm (e.g., $\ell_2$ or $\ell_\infty$). The analysis then requires uniform quadratic mean differentiability and positive-definite Fisher information matrices in all parameter directions. The resulting minimax rate remains $(\log N / N)^{\beta/(2\beta+s)}$, since $k$ is fixed. Our inferential target is the function $\theta(\cdot)$, with an emphasis on uniform accuracy over an interior region $\mathcal{Y}_0 \subset \mathrm{int}(\mathcal{Y})$.

2.2. Local Likelihood Estimator and the Role of the Link Function

Copula parameters are typically constrained (e.g., $\theta > 0$ or $\theta \ge 1$), so implementation is simplified by working on an unconstrained scale. Let $\psi : \Theta \to \mathbb{R}$ be a strictly monotone link, and define $\nu(y) = \psi(\theta(y))$, so that $\theta(y) = \psi^{-1}(\nu(y))$ (see also [3,4]). The link is an estimation device: it facilitates local polynomial modeling on $\mathbb{R}$ while ensuring that $\hat{\theta}(y)$ lies in $\Theta$. In contrast, the minimax analysis in Section 3 and Section 4 and Appendix A is formulated directly for $\theta(\cdot)$ restricted to a compact subset $\Theta_0 \subset \mathrm{int}(\Theta)$, so the link function does not enter the lower-bound arguments. Since $\psi$ is smooth and strictly monotone, and $\Theta_0$ is compactly contained in $\mathrm{int}(\Theta)$, its derivative is bounded and bounded away from zero on $\Theta_0$. Hence, both $\psi$ and $\psi^{-1}$ are Lipschitz on $\Theta_0$ (a property also used in [4]). Consequently, the sup-norm losses in $\theta$ and in the transformed parameter $\nu = \psi(\theta)$ are equivalent up to multiplicative constants. Minimax rates under sup-norm loss are therefore invariant under such smooth monotone reparameterizations, and the lower-bound analysis can be conducted directly in terms of $\theta(\cdot)$ without loss of generality.
Let $p \ge 0$ be an integer. Around a fixed evaluation point $y \in \mathcal{Y}$, approximate $\nu(\cdot)$ by a multivariate polynomial of total degree $p$: $\nu(Y_i) \approx \phi_p(Y_i - y)^\top \gamma$, where $\phi_p(\cdot)$ collects monomials up to degree $p$, and $\gamma$ is the vector of local coefficients [4]. Given a kernel $K$ and a bandwidth $h > 0$, define $K_h(v) = h^{-s} K(v/h)$ and the kernel-weighted local copula log-likelihood
$$L_N(\gamma; y, p, h) = \sum_{i=1}^{N} \log c\Big( U_i \,\Big|\, \psi^{-1}\big( \phi_p(Y_i - y)^\top \gamma \big) \Big)\, K_h(Y_i - y). \qquad (4)$$
The local maximum likelihood estimator is
$$\hat{\gamma}(y) = \arg\max_{\gamma} L_N(\gamma; y, p, h), \qquad \hat{\nu}(y) = e_0^\top \hat{\gamma}(y), \qquad \hat{\theta}(y) = \psi^{-1}(\hat{\nu}(y)),$$
where $e_0$ selects the intercept term. Solving this maximization over a grid of evaluation points yields an estimator of the full calibration curve $\theta(\cdot)$.
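To make the procedure concrete, the following minimal Python sketch implements the local linear ($p = 1$, $s = 1$) version of (4) for the Clayton family with a log link $\psi = \log$, using the Epanechnikov kernel employed in Section 5. This is an illustration under stated assumptions, not the authors' supplementary code: the function names, the starting value, the clipping bounds, and the Nelder–Mead optimizer are our own choices.

```python
import numpy as np
from scipy.optimize import minimize

def clayton_logpdf(u, v, theta):
    # log c_theta(u, v) for the bivariate Clayton copula, theta > 0
    return (np.log1p(theta)
            - (1.0 + theta) * (np.log(u) + np.log(v))
            - (2.0 + 1.0 / theta) * np.log(u**(-theta) + v**(-theta) - 1.0))

def local_linear_theta(u, v, Y, y0, h):
    """Local linear likelihood estimate of theta(y0); a sketch of eq. (4)."""
    keep = np.abs(Y - y0) <= h                 # Epanechnikov kernel support
    u0, v0, d = u[keep], v[keep], Y[keep] - y0
    w = 0.75 * (1.0 - (d / h) ** 2) / h        # kernel weights K_h(Y_i - y)

    def neg_loglik(gamma):
        nu = gamma[0] + gamma[1] * d           # local polynomial, degree p = 1
        theta = np.exp(np.clip(nu, -5.0, 3.0)) # inverse log link; clipped for stability
        return -np.sum(w * clayton_logpdf(u0, v0, theta))

    res = minimize(neg_loglik, x0=np.array([np.log(2.0), 0.0]), method="Nelder-Mead")
    return float(np.exp(res.x[0]))             # theta_hat(y0) = psi^{-1}(nu_hat(y0))
```

Evaluating this function over a grid of points $y_0$ yields the estimated calibration curve described above.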
For each evaluation point $y$, the computation of $\hat{\gamma}(y)$ requires maximization of the kernel-weighted log-likelihood (4), involving $N$ weighted likelihood contributions. If estimation is performed over $m$ grid points, the computational cost per bandwidth value is of order $O(mN)$, up to constants depending on the optimization routine and parameter dimension.
In the multivariate covariate setting ($s > 1$), additional cost arises from the kernel evaluation in $\mathbb{R}^s$ and from the larger number of grid points needed to cover the covariate domain. When $\theta(y) \in \mathbb{R}^k$ with fixed $k$, the optimization cost scales linearly in $k$. Bandwidth selection via leave-one-out cross-validation multiplies this cost by the number of candidate bandwidths. Although the runtime increases with the covariate dimension due to standard curse-of-dimensionality effects, the procedure remains tractable for moderate $s$ and fixed $k$. The uniform asymptotic properties of $\hat{\theta}(\cdot)$ under this framework are established in [4].
The bandwidth $h$ in (4) controls the degree of smoothing in the local likelihood estimator. In the proof of the minimax lower bound (Appendix A), a separate localization scale is introduced through the support of the bump functions and is chosen proportionally to $(\log N / N)^{1/(2\beta+s)}$, in order to balance the separation between alternative parameter functions and the resulting Kullback–Leibler divergences. This scaling coincides with the canonical rate governing uniform bias–variance tradeoffs for local polynomial estimators and thus links the constructive procedure in this section to the minimax benchmark established in Theorem 1.

2.3. Bandwidth Selection

Implementing $\hat{\theta}(\cdot)$ requires selecting both a smoothing bandwidth and a copula family. Bandwidth selection is typically carried out using the leave-one-out cross-validated local likelihood. Let $\hat{\theta}_h(\cdot)$ denote the estimator computed with bandwidth $h$, and let $\hat{\theta}_{h,-i}(Y_i)$ be the corresponding leave-one-out estimate evaluated at $Y_i$. Define the cross-validated local log-likelihood
$$\mathrm{CVL}(h) = \sum_{i=1}^{N} \log c\big( U_i \mid \hat{\theta}_{h,-i}(Y_i) \big), \qquad (5)$$
and select the local-likelihood leave-one-out cross-validation (LOO-CVL) bandwidth as
$$h_{\mathrm{cv}} = \arg\max_{h} \mathrm{CVL}(h). \qquad (6)$$
This approach was introduced by [3] and extended to multivariate covariates in [4].
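As an illustration, the criterion (5)–(6) can be implemented by brute force on a bandwidth grid. The sketch below reuses clayton_logpdf and local_linear_theta from the previous snippet; it is again our own simplified version, and the candidate grid h_grid is a hypothetical choice.

```python
def loo_cvl(h, u, v, Y):
    """Leave-one-out cross-validated local likelihood CVL(h), cf. eq. (5)."""
    idx = np.arange(len(Y))
    total = 0.0
    for i in idx:
        keep = idx != i                      # drop observation i before refitting
        th_i = local_linear_theta(u[keep], v[keep], Y[keep], y0=Y[i], h=h)
        total += clayton_logpdf(u[i], v[i], th_i)
    return total

# h_cv maximizes CVL over a candidate grid, cf. eq. (6):
# h_grid = np.linspace(0.3, 1.5, 13)
# h_cv = max(h_grid, key=lambda h: loo_cvl(h, u, v, Y))
```

The nested leave-one-out loop makes the cost per candidate bandwidth quadratic in $N$ for this naive version, consistent with the cost discussion in Section 2.2.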

3. Model, Loss, and Regularity Conditions

Let $\{(U_i, Y_i)\}_{i=1}^N$ be i.i.d. observations, where $Y_i \in \mathbb{R}^s$ denotes a vector of covariates, and $U_i = (U_{i1}, \dots, U_{id}) \in (0,1)^d$ are pseudo-observations. Conditional on $Y_i = y$, we assume that $U_i$ has copula density $c(\cdot \mid \theta(y))$ for a parametric family $\{c(\cdot \mid \theta) : \theta \in \Theta \subset \mathbb{R}\}$. For clarity, we take the copula parameter to be scalar. The analysis extends to $\theta(y) \in \mathbb{R}^k$ with fixed $k$ by replacing scalar perturbations with coordinate-wise vector perturbations and imposing multivariate quadratic mean differentiability, together with uniformly positive-definite Fisher information matrices. Since $k$ is fixed, this modification affects only constants and not the minimax rate. Our objective is the uniform estimation of the parameter function $\theta(\cdot)$ over a compact region $\mathcal{Y}_0 \subset \mathrm{int}(\mathcal{Y})$.

3.1. Smoothness Class and Minimax Risk

To formalize the smoothness of the parameter function, we work over Hölder-type classes.
Definition 1 (Hölder class; [16]). Let $\beta > 0$ and write $p = \lfloor \beta \rfloor$ and $\kappa = \beta - p \in (0, 1]$. For $L > 0$, define $H^\beta(L)$ to be the collection of functions $\theta : \mathcal{Y} \to \mathbb{R}$ such that:
(i) $\theta$ has continuous partial derivatives $\partial^\alpha \theta$ for all multi-indices $\alpha \in \mathbb{N}_0^s$ with $|\alpha| \le p$, and
$$\max_{|\alpha| \le p} \sup_{y \in \mathcal{Y}} | \partial^\alpha \theta(y) | \le L;$$
(ii) the derivatives of order $p$ are $\kappa$-Hölder continuous, i.e.,
$$\max_{|\alpha| = p} \sup_{\substack{y, y' \in \mathcal{Y} \\ y \ne y'}} \frac{| \partial^\alpha \theta(y) - \partial^\alpha \theta(y') |}{\| y - y' \|^\kappa} \le L,$$
where $\|\cdot\|$ denotes the Euclidean norm on $\mathbb{R}^s$.
The performance of an estimator $\hat{\theta} = \hat{\theta}\big( \{(U_i, Y_i)\}_{i=1}^N \big)$ is measured using the sup-norm loss
$$\| \hat{\theta} - \theta \|_\infty := \sup_{y \in \mathcal{Y}_0} | \hat{\theta}(y) - \theta(y) |.$$
The corresponding minimax sup-norm risk over $H^\beta(L)$ is defined as
$$R_N := \inf_{\hat{\theta}} \sup_{\theta \in H^\beta(L)} \mathbb{E}_\theta \| \hat{\theta} - \theta \|_\infty,$$
where $\mathbb{E}_\theta$ denotes expectation under the joint distribution induced by the conditional copula model with parameter function $\theta(\cdot)$. The aim of the subsequent analysis is to derive a lower bound for $R_N$.

3.2. Regularity Assumptions

The minimax lower bound is proved under the following assumptions.
(A1) The covariate $Y$ has a density $f_Y$ on $\mathcal{Y}$, and there exist constants $0 < c_f \le C_f < \infty$ such that $c_f \le f_Y(y) \le C_f$ for all $y \in \mathcal{Y}_0$.
(A2) The set $\mathcal{Y}_0$ is compact and contained in the interior of $\mathcal{Y}$, and there exists $r_0 > 0$ such that the closed Euclidean ball $B(y, r_0) \subset \mathcal{Y}$ for all $y \in \mathcal{Y}_0$.
(A3) There exists a compact interval $\Theta_0 \subset \mathrm{int}(\Theta)$ such that the parametric family $\{c(\cdot \mid \theta) : \theta \in \Theta\}$ is quadratic mean-differentiable (QMD) on an open set containing $\Theta_0$. That is, for each $\theta \in \Theta_0$, there exists a score function $\dot{\ell}_\theta \in L_2(c(\cdot \mid \theta))$ such that, as $t \to 0$,
$$\int \left[ \sqrt{c(u \mid \theta + t)} - \sqrt{c(u \mid \theta)} - \frac{t}{2}\, \dot{\ell}_\theta(u) \sqrt{c(u \mid \theta)} \right]^2 du = o(t^2),$$
uniformly in $\theta \in \Theta_0$. Moreover, the Fisher information $I(\theta) := \mathbb{E}_\theta \big[ \dot{\ell}_\theta(U)^2 \big]$ satisfies $0 < \underline{I} \le I(\theta) \le \overline{I} < \infty$ for all $\theta \in \Theta_0$, where $\mathbb{E}_\theta$ denotes expectation under $U \sim c(\cdot \mid \theta)$.
(A4) The Hölder class $H^\beta(L)$ is restricted to functions taking values in $\Theta_0$ on $\mathcal{Y}$.
The verification of (A3) for the bivariate Clayton and Gumbel copula families (on any compact $\Theta_0 \subset \mathrm{int}(\Theta)$) is given in Appendix B. Assumption (A3) is a standard local regularity condition for likelihood models [18]. In particular, it yields a local quadratic control of the Kullback–Leibler divergence on $\Theta_0$, which is the key technical tool that replaces any global bounded-likelihood-ratio requirement and allows for copula densities that are unbounded on $(0,1)^d$.

4. Main Result: Minimax Lower Bound for Sup-Norm Risk

We establish a minimax lower bound for the uniform estimation of $\theta(\cdot)$ over $H^\beta(L)$. The proof follows the classical packing-plus-Fano strategy and uses only the local quadratic control of Kullback–Leibler divergences implied by QMD.
Theorem 1. Under assumptions (A1)–(A4), let $\beta > 0$, and write $p = \lfloor \beta \rfloor$ and $\kappa = \beta - p \in (0, 1]$. There exists a constant $c > 0$, depending only on $s, \beta, L, c_f, C_f, \underline{I}, \overline{I}$, and $\Theta_0$, such that for all sufficiently large $N$,
$$\inf_{\hat{\theta}} \sup_{\theta \in H^\beta(L)} \mathbb{E}_\theta \| \hat{\theta} - \theta \|_\infty \;\ge\; c \left( \frac{\log N}{N} \right)^{\frac{\beta}{2\beta + s}}.$$
Remark 1. The rate $(\log N / N)^{\beta/(2\beta+s)}$ coincides with the classical minimax rate for the sup-norm estimation of a $\beta$-smooth regression function on an $s$-dimensional domain [16]. Theorem 1 shows that the conditional copula likelihood structure does not permit faster uniform estimation, even when $c(\cdot \mid \theta)$ is unbounded on $(0,1)^d$.
Proof outline. The argument combines a packing construction in $H^\beta(L)$, a local quadratic Kullback–Leibler bound implied by QMD, and a testing reduction via Fano's inequality. The complete proof (including all auxiliary lemmas) is deferred to Appendix A. □

5. Simulation Study

We illustrate the finite-sample behavior of the local likelihood estimator from Section 2 in a setting aligned with the smoothness regime of the minimax analysis. We consider the bivariate case ($d = 2$) with a single covariate ($s = 1$) and employ a local linear fit ($p = 1$), corresponding to a calibration function that is $\beta = p + 1 = 2$ Hölder smooth.
Bandwidth choice plays a central role both in practice and in theory. In estimation, the local likelihood estimator is implemented with a smoothing bandwidth $h$ selected by the leave-one-out cross-validated local likelihood (LOO-CVL), as defined in (6). In contrast, the minimax lower-bound proof (see Appendix A) introduces a theoretical localization scale through the support radius of the bump functions, tuned at the canonical rate $(\log N / N)^{1/(2\beta+s)}$ to balance the separation between alternative parameter functions and the control of the Kullback–Leibler divergence. Motivated by this construction, we define the minimax benchmark localization scale
$$h_{\mathrm{local}} = c \left( \frac{\log N}{N} \right)^{\frac{1}{2\beta + s}}, \qquad c = 1,$$
which represents the minimax-optimal spatial resolution for uniform estimation.
The reader should note that $h_{\mathrm{local}}$ is not a data-adaptive kernel smoothing bandwidth chosen for estimation but, rather, a theoretical localization scale arising from the minimax lower-bound argument. Its comparison with the data-driven LOO-CVL bandwidth $h_{\mathrm{cv}}$ is nevertheless informative, since both quantities govern the effective spatial resolution at which the local likelihood estimator can reliably detect variation in $\theta(\cdot)$. Examining whether $h_{\mathrm{cv}}$ tracks this canonical rate (up to constants) therefore provides an operational link between the minimax benchmark and a standard bandwidth selection rule used in applications.
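The benchmark quantities are immediate to compute. The following small sketch (our own) evaluates $h_{\mathrm{local}}$ with $\beta = 2$, $s = 1$, and $c = 1$, reproducing the $h_{\mathrm{local}}$ column of Table 1, together with the benchmark rate $r_N$ used later.

```python
import numpy as np

beta, s = 2.0, 1                     # smoothness and covariate dimension of Section 5
for N in [50, 100, 250, 500, 750]:
    h_local = (np.log(N) / N) ** (1.0 / (2 * beta + s))  # localization scale, c = 1
    r_N = (np.log(N) / N) ** (beta / (2 * beta + s))     # minimax benchmark rate
    print(f"N = {N:4d}:  h_local = {h_local:.4f},  r_N = {r_N:.4f}")
```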
In each replication, covariate values are generated as $Y \sim \mathrm{TN}(0, 4; [-2, 2])$, a truncated normal distribution with mean 0, variance 4, and support restricted to $[-2, 2]$. To mitigate boundary effects, estimation and evaluation are restricted to the interior region $\mathcal{Y}_0 = [-2 + \delta, 2 - \delta]$ with $\delta = 0.20$, carried out on an equally spaced grid of size $m = 60$.
Conditional on each observed covariate value $y$, we generate a single pseudo-observation $U = (U_1, U_2)$ from a conditional copula model with parameter $\theta(y)$. We consider both Clayton and Gumbel copula families, with smooth, non-constant calibration functions given by
$$\theta_{\mathrm{Clayton}}(y) = 1.5 + 0.9 \sin\!\Big( \frac{\pi}{2}(y + 2) \Big), \qquad \theta_{\mathrm{Gumbel}}(y) = 2.0 + 0.8 \sin\!\Big( \frac{\pi}{2}(y + 2) \Big).$$
These specifications induce moderate covariate-driven variation in dependence while remaining within standard parameter ranges for each family.
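For the Clayton family, conditional sampling has a closed form via the inverse conditional CDF. The following sketch of the data-generating step is our own illustration of this design (the Gumbel analogue has no closed-form inverse and would require numerical root-finding); the seed and the crude rejection step for the truncated normal are assumptions of the sketch.

```python
import numpy as np

rng = np.random.default_rng(1)

def theta_clayton(y):
    # calibration function used in the simulation study
    return 1.5 + 0.9 * np.sin(0.5 * np.pi * (y + 2.0))

def sample_clayton_pair(theta, rng):
    # inverse conditional CDF of U2 given U1 = u for the Clayton copula
    u, w = rng.uniform(size=2)
    v = (u**(-theta) * (w**(-theta / (1.0 + theta)) - 1.0) + 1.0) ** (-1.0 / theta)
    return u, v

# covariates: N(0, 4) truncated to [-2, 2] by rejection, then N = 500 kept
Y = rng.normal(0.0, 2.0, size=4000)
Y = Y[np.abs(Y) <= 2.0][:500]

# one pseudo-observation per covariate value, as in the simulation design
U = np.array([sample_clayton_pair(theta_clayton(y), rng) for y in Y])
```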
Estimation is performed using the kernel-weighted local likelihood procedure described in Section 2, with the Epanechnikov kernel. For each bandwidth choice, the local likelihood is maximized at each grid point to obtain $\hat{\theta}(y)$, using an unconstrained parameterization and an inverse link mapping to ensure admissibility. We compare the localization scale $h_{\mathrm{local}}$ with the data-driven leave-one-out cross-validated bandwidth $h_{\mathrm{cv}}$ defined in (6).
The experiment is repeated over $R = 100$ independent replications for each sample size $N \in \{50, 100, 250, 500, 750\}$. Performance is evaluated on the grid over $\mathcal{Y}_0$ using both the sup-norm loss and a discrete $L_2$ loss,
$$\| \hat{\theta} - \theta \|_\infty = \max_{y \in \mathcal{Y}_0} | \hat{\theta}(y) - \theta(y) |, \qquad L_2(\hat{\theta}, \theta) = \left( \frac{1}{m} \sum_{j=1}^{m} \big( \hat{\theta}(y_j) - \theta(y_j) \big)^2 \right)^{1/2},$$
where $\{y_j\}_{j=1}^m$ denotes the equally spaced evaluation grid on $\mathcal{Y}_0$. Thus, $L_2(\hat{\theta}, \theta)$ is the root mean squared pointwise error across the grid points, i.e., a Riemann-sum approximation to an integrated squared error on $\mathcal{Y}_0$ under the uniform measure on the grid.
To connect the numerical results to the theoretical analysis, we report the minimax benchmark rate from Theorem 1, specialized to $\beta = 2$ and $s = 1$,
$$r_N = \left( \frac{\log N}{N} \right)^{\frac{\beta}{2\beta + s}} = \left( \frac{\log N}{N} \right)^{2/5},$$
and we compare the observed sup-norm errors to $r_N$ in rate plots.
Since minimax rates are defined only up to unknown positive multiplicative constants, agreement in slopes on a log–log scale (rather than absolute vertical alignment) is the appropriate diagnostic when comparing empirical errors with $r_N$.
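A slope comparison of this kind can be read off a simple least-squares fit on the log–log scale. The sketch below (our own) uses the Clayton sup-norm means under $h_{\mathrm{cv}}$ from Table 2 as the empirical errors.

```python
import numpy as np

N = np.array([50, 100, 250, 500, 750])
r_N = (np.log(N) / N) ** 0.4                              # benchmark, beta = 2, s = 1
err = np.array([3.4876, 2.0054, 0.8960, 0.6160, 0.4767])  # Table 2, sup-norm, cv

slope_emp = np.polyfit(np.log(N), np.log(err), 1)[0]      # empirical decay rate
slope_ref = np.polyfit(np.log(N), np.log(r_N), 1)[0]      # benchmark decay rate
print(f"empirical slope {slope_emp:.2f} vs benchmark slope {slope_ref:.2f}")
```

Comparable fitted slopes, rather than vertical coincidence of the two curves, indicate agreement with the minimax scaling.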
Figure 1 shows curve recovery at $N = 750$ for both copula families. The blue curve denotes the true calibration function $\theta(y)$, while the orange and green curves represent the Monte Carlo mean of $\hat{\theta}(y)$ obtained under the theoretical localization scale $h_{\mathrm{local}}$ and the data-driven LOO-CVL bandwidth $h_{\mathrm{cv}}$, respectively. For a fixed $N$, $h_{\mathrm{local}}$ is deterministic, whereas $h_{\mathrm{cv}}$ is selected in each replication. Both estimators closely track the smooth structure of $\theta(y)$ over the interior region $\mathcal{Y}_0$.
The shaded region corresponds to a pointwise 95% confidence band constructed only for the estimator using $h_{\mathrm{cv}}$. Following [3], the asymptotic variance of the local polynomial likelihood estimator is
$$\mathrm{Var}\big( \hat{\theta}(y) \big) = \big[ N h_{\mathrm{cv}} f_Y(y)\, \sigma^2(y) \big]^{-1}\, e_0^\top S^{-1} S^{*} S^{-1} e_0,$$
where $\sigma^2(y) = \mathbb{E}\big[ -\partial_\theta^2 \ell_{\theta(y)}(U) \mid Y = y \big]$ with $\ell_\theta(U) = \log c(U \mid \theta)$, and $S, S^{*}$ denote the kernel moment matrices corresponding to the local linear fit. The resulting confidence interval is $\hat{\theta}(y) \pm z_{1-\alpha/2} \sqrt{ \widehat{\mathrm{Var}}(\hat{\theta}(y)) }$. The band is shown for diagnostic purposes and is not intended for uniform inference.
Table 1 reports the localization scale $h_{\mathrm{local}}$ and the Monte Carlo mean of the LOO-CVL bandwidth, $\bar{h}_{\mathrm{cv}}$, together with its Monte Carlo standard deviation. As implied by its definition, $h_{\mathrm{local}}$ is deterministic for each $N$ and decreases monotonically as $N$ increases. In contrast, the data-driven $\bar{h}_{\mathrm{cv}}$ is systematically larger than $h_{\mathrm{local}}$ for both copula families, reflecting the additional smoothing preferred by the finite-sample LOO-CVL objective.
Figure 2 visualizes the theoretical localization scale $h_{\mathrm{local}}$ and the Monte Carlo mean LOO-CVL bandwidth $\bar{h}_{\mathrm{cv}}$ across $N \in \{50, 100, 250, 500, 750\}$ for both copula families.
Table 2 and Table 3 report finite-sample performance based on $R = 100$ Monte Carlo replications for $N = 50, 100, 250, 500, 750$. For each sample size $N$, the columns $\|\cdot\|_{\infty,\mathrm{local}}$ and $\|\cdot\|_{\infty,\mathrm{cv}}$ denote the Monte Carlo mean of the sup-norm loss $\|\hat{\theta} - \theta\|_\infty$ evaluated on $\mathcal{Y}_0$ when the estimator is computed using the benchmark localization scale $h_{\mathrm{local}}$ and the data-driven LOO-CVL bandwidth $h_{\mathrm{cv}}$, respectively. The corresponding columns $\mathrm{SD}_{\infty,\mathrm{local}}$ and $\mathrm{SD}_{\infty,\mathrm{cv}}$ report the Monte Carlo standard deviations of these losses across replications. Analogously, $L_{2,\mathrm{local}}$ and $L_{2,\mathrm{cv}}$ denote the Monte Carlo mean of the discrete $L_2$ loss over $\mathcal{Y}_0$ under the two smoothing choices, with $\mathrm{SD}_{L_2,\mathrm{local}}$ and $\mathrm{SD}_{L_2,\mathrm{cv}}$ giving the associated Monte Carlo standard deviations.
Across both copula families, mean errors decrease monotonically with $N$ under both $h_{\mathrm{local}}$ and $h_{\mathrm{cv}}$, and the associated Monte Carlo standard deviations also decline, reflecting improved stability in larger samples.
For the Clayton copula, $h_{\mathrm{cv}}$ yields uniformly smaller mean errors than the theoretical localization scale $h_{\mathrm{local}}$ under both loss metrics at all reported $N$. The same pattern holds for the Gumbel copula: the LOO-CVL choice produces smaller mean sup-norm and $L_2$ errors at every sample size, although the magnitude of improvement is moderate. Overall, the LOO-CVL rule typically achieves modest gains in accuracy without inflating variability.
The decay of estimation error with $N$ is summarized in Figure 3, Figure 4 and Figure 5. In Figure 3 and Figure 4, the vertical axis is log-scaled to highlight rate behavior, while the horizontal axis displays the raw sample size $N$. For both copulas and both loss metrics, the Monte Carlo mean errors decrease steadily with $N$, and, as reflected in Table 2 and Table 3, the LOO-CVL choice $h_{\mathrm{cv}}$ yields uniformly smaller mean errors than $h_{\mathrm{local}}$ at all reported sample sizes.
Figure 5 compares the Monte Carlo mean sup-norm error under the LOO-CVL bandwidth to the minimax benchmark rate $r_N = (\log N / N)^{2/5}$. In both copula families, the empirical error decays at a rate comparable to $r_N$ across the considered sample sizes. Because minimax rates are defined only up to multiplicative constants, and finite-sample constants and lower-order logarithmic factors can generate visible vertical separation, agreement is assessed through the slope (i.e., the rate of decay) rather than exact vertical coincidence. The observed slopes therefore support the predicted minimax scaling, with the remaining gap reflecting finite-sample effects.
We emphasize that minimax lower bounds describe worst-case asymptotic scaling over the smoothness class and do not determine the finite-sample ordering of specific smoothing choices. Both $h_{\mathrm{local}}$ and $h_{\mathrm{cv}}$ operate at the same asymptotic rate, while the observed differences reflect finite-sample constants and bias-variance tradeoffs.

6. Discussion and Conclusions

This paper has established a minimax lower bound for the uniform estimation of covariate-dependent copula parameters over Hölder classes, showing that $R_N \ge c\, (\log N / N)^{\beta/(2\beta+s)}$ for some constant $c > 0$. The resulting rate coincides with the classical minimax sup-norm rate for $\beta$-smooth regression functions on an $s$-dimensional domain. Thus, despite the nonlinear structure of the copula likelihood, the global difficulty of uniform calibration is governed solely by smoothness and dimension.
The lower-bound argument relies only on local curvature via quadratic mean differentiability and does not require globally bounded likelihood ratios, a feature that is particularly relevant for copula densities with boundary singularities. In more complex dependence settings, high-dimensional copulas are frequently constructed using vine copulas, which represent multivariate copulas through structured collections of pair-copula components [19]. When such pair-copula parameters depend on covariates, the calibration problem becomes inherently multi-parameter and structurally constrained, substantially increasing analytical complexity. Recent developments integrating vine copulas into modern machine learning frameworks, including predictive uncertainty quantification in deep neural networks [20], further underscore the need to understand statistical complexity in structured dependence models. Extending information-theoretic minimax lower bounds to covariate-dependent vine copula models, therefore, represents a natural and technically challenging direction for future research.
Combined with the uniform upper bounds established in [4], Theorem 1 implies that local polynomial likelihood estimators are minimax rate-optimal up to multiplicative constants and the unavoidable logarithmic factor inherent to uniform estimation.
Several extensions merit further investigation. First, the present analysis assumes independent observations; extending minimax lower bounds to weakly dependent settings (e.g., mixing arrays or copula-based Markov chains) would complement recent work on dependence and mixing in copula-based time series [21]. Second, multi-parameter copulas introduce additional geometric and information-theoretic complexity in both curvature control and packing constructions. Finally, establishing adaptive minimax rates under unknown smoothness, or deriving minimax results for other functionals of θ ( · ) , such as tail dependence coefficients, remains an important direction for future research.

Supplementary Materials

The following supporting information can be downloaded at https://www.mdpi.com/article/10.3390/math14050914/s1: Python code for the simulation study, including the implementation of the kernel-weighted local likelihood estimator and the experiments for the Gumbel and Clayton copula models used in Section 5.

Author Contributions

Conceptualization, M.N.M. and C.S.A.; methodology, M.N.M.; software, M.N.M.; validation, M.N.M., O.A. and C.S.A.; formal analysis, M.N.M.; investigation, M.N.M., O.A. and C.S.A.; writing—original draft, M.N.M. and O.A.; writing—review and editing, O.A. and C.S.A.; visualization, O.A. and C.S.A.; supervision, M.N.M.; project administration, M.N.M. and C.S.A. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The original contributions presented in this study are included in the Supplementary Materials. Further inquiries can be directed to the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A. Proofs for the Minimax Lower Bound

Appendix A.1. A Localized Packing Family

Lemma A1. Fix $\beta > 0$, and write $p = \lfloor \beta \rfloor$ and $\kappa = \beta - p \in (0, 1]$. There exist constants $c_0, c_1 > 0$ and, for each integer $m \ge 2$, a finite family $\{\theta^{(0)}, \theta^{(1)}, \dots, \theta^{(M)}\} \subset H^\beta(L)$ with $M \ge c_0 m^s$, such that:
(i) $\| \theta^{(j)} - \theta^{(k)} \|_\infty \ge c_1 m^{-\beta}$ for all $j \ne k$;
(ii) $\theta^{(j)}(y) \in \Theta_0$ for all $y \in \mathcal{Y}$ and all $j$.
Proof. By assumption (A2), the compact set $\mathcal{Y}_0$ contains a closed cube $Q \subset \mathcal{Y}_0$ of side length $\ell_Q > 0$. Partition $Q$ into a regular grid of $m^s$ subcubes, each of side length $\ell_Q / m$, and let $\{y_1, \dots, y_{m^s}\}$ denote their centers.
Let $\varphi : \mathbb{R}^s \to [0,1]$ be a $C^\infty$ bump function supported on $[-1/2, 1/2]^s$, satisfying $\varphi(0) = 1$ and having bounded derivatives of all orders. Define
$$h := \frac{\ell_Q}{2m}, \qquad \varphi_k(y) := \varphi\!\left( \frac{y - y_k}{h} \right), \quad k = 1, \dots, m^s.$$
Since the grid spacing between distinct centers equals $\ell_Q / m = 2h$, the supports of the functions $\{\varphi_k\}_{k=1}^{m^s}$ are pairwise disjoint.
Fix $\theta^\star \in \mathrm{int}(\Theta_0)$, and choose $\delta_0 > 0$ such that $[\theta^\star - \delta_0, \theta^\star + \delta_0] \subset \Theta_0$. Define $\theta^{(0)}(y) := \theta^\star$ and $\theta^{(k)}(y) := \theta^\star + a h^\beta \varphi_k(y)$, $k = 1, \dots, m^s$, where $a > 0$ is a constant to be chosen below. Set $M := m^s$.
For $j \ne k$, using $\varphi_j(y_j) = 1$ and $\varphi_k(y_j) = 0$, we obtain $\| \theta^{(j)} - \theta^{(k)} \|_\infty \ge | \theta^{(j)}(y_j) - \theta^{(k)}(y_j) | = a h^\beta$. Since $h^\beta = (\ell_Q/2)^\beta m^{-\beta}$, property (i) holds with $c_1 := a (\ell_Q/2)^\beta$.
We next verify that $\theta^{(k)} \in H^\beta(L)$. Let $\alpha$ be a multi-index with $|\alpha| \le p$. For $k \ge 1$,
$$\partial^\alpha \theta^{(k)}(y) = a h^\beta h^{-|\alpha|} (\partial^\alpha \varphi)\!\left( \frac{y - y_k}{h} \right),$$
so
$$\sup_{y \in \mathcal{Y}} | \partial^\alpha \theta^{(k)}(y) | \le a h^{\beta - |\alpha|} \sup_{z \in \mathbb{R}^s} | \partial^\alpha \varphi(z) | \le a \sup_{z \in \mathbb{R}^s} | \partial^\alpha \varphi(z) |.$$
Choosing $a > 0$ sufficiently small (depending only on $L$ and $\varphi$) ensures that all derivatives of order at most $p$ are uniformly bounded by $L$.
Now, fix a multi-index $\alpha$ with $|\alpha| = p$, and take arbitrary $y, y' \in \mathcal{Y}$, $y \ne y'$. If both $y$ and $y'$ lie outside $\mathrm{supp}(\varphi_k)$, then $\partial^\alpha \theta^{(k)}(y) = \partial^\alpha \theta^{(k)}(y') = 0$, and the Hölder quotient vanishes. If both points lie in $\mathrm{supp}(\varphi_k)$, the mean value theorem yields
$$| \partial^\alpha \theta^{(k)}(y) - \partial^\alpha \theta^{(k)}(y') | \le a h^{\beta - p - 1} \sup_{z \in \mathbb{R}^s} \| \nabla \partial^\alpha \varphi(z) \| \, \| y - y' \|.$$
Since $\mathrm{diam}(\mathrm{supp}(\varphi_k)) \le \sqrt{s}\, h$, dividing by $\| y - y' \|^\kappa$ gives
$$\frac{| \partial^\alpha \theta^{(k)}(y) - \partial^\alpha \theta^{(k)}(y') |}{\| y - y' \|^\kappa} \le a (\sqrt{s})^{1-\kappa} \sup_{z \in \mathbb{R}^s} \| \nabla \partial^\alpha \varphi(z) \| =: c_2 a.$$
It remains to consider the case in which exactly one of $y, y'$ lies in $\mathrm{supp}(\varphi_k)$. Without loss of generality, assume $y \in \mathrm{supp}(\varphi_k)$ and $y' \notin \mathrm{supp}(\varphi_k)$, so that $\partial^\alpha \theta^{(k)}(y') = 0$. If $\| y - y' \| \ge h$, then
$$\frac{| \partial^\alpha \theta^{(k)}(y) - \partial^\alpha \theta^{(k)}(y') |}{\| y - y' \|^\kappa} \le \frac{a h^{\beta - p} \sup_z | \partial^\alpha \varphi(z) |}{h^\kappa} =: c_3 a.$$
If instead $\| y - y' \| < h$, let $y''$ be a point on the line segment joining $y$ and $y'$ that lies on the boundary of $\mathrm{supp}(\varphi_k)$. Then $\partial^\alpha \theta^{(k)}(y'') = 0$, and $\| y - y'' \| \le \| y - y' \| < h$. Applying the mean value theorem to $y$ and $y''$ yields
$$| \partial^\alpha \theta^{(k)}(y) - \partial^\alpha \theta^{(k)}(y') | = | \partial^\alpha \theta^{(k)}(y) - \partial^\alpha \theta^{(k)}(y'') | \le a h^{\beta - p - 1} \sup_z \| \nabla \partial^\alpha \varphi(z) \| \, \| y - y'' \|.$$
Dividing by $\| y - y' \|^\kappa$ and using $\| y - y'' \| \le \| y - y' \|$ together with $\| y - y' \|^{1-\kappa} \le h^{1-\kappa}$ gives
$$\frac{| \partial^\alpha \theta^{(k)}(y) - \partial^\alpha \theta^{(k)}(y') |}{\| y - y' \|^\kappa} \le a h^{\beta - p - 1} h^{1 - \kappa} \sup_z \| \nabla \partial^\alpha \varphi(z) \| =: c_4 a.$$
Combining the above cases shows that
$$\max_{|\alpha| = p} \sup_{y \ne y'} \frac{| \partial^\alpha \theta^{(k)}(y) - \partial^\alpha \theta^{(k)}(y') |}{\| y - y' \|^\kappa} \le c_5 a,$$
for a constant $c_5$ depending only on $\varphi$, $s$, and $\beta$. Choosing $a > 0$ sufficiently small ensures that the $\kappa$-Hölder condition holds with constant $L$, and hence $\theta^{(k)} \in H^\beta(L)$.
Finally, since $0 \le \varphi_k \le 1$, we have $| \theta^{(k)}(y) - \theta^\star | \le a h^\beta \le a$. Choosing $a \le \delta_0$ guarantees $\theta^{(k)}(y) \in \Theta_0$ for all $y \in \mathcal{Y}$ and all $k$. The lemma follows with $c_0 := 1$. □
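To visualize the construction, the following small sketch (our own, in Python, for $s = 1$) builds the disjoint bump perturbations $\theta^{(k)}$ from a standard compactly supported $C^\infty$ bump; the amplitude $a$, cube $Q = [-1, 1]$, and resolution $m$ are illustrative choices.

```python
import numpy as np

def bump(z):
    # C-infinity bump supported on [-1/2, 1/2] with bump(0) = 1
    out = np.zeros_like(z, dtype=float)
    inside = np.abs(z) < 0.5
    out[inside] = np.exp(1.0 - 1.0 / (1.0 - (2.0 * z[inside]) ** 2))
    return out

beta, a, theta_star = 2.0, 0.05, 2.0     # smoothness, amplitude, center value
ellQ, m = 2.0, 8                          # cube side length and grid resolution
h = ellQ / (2 * m)
centers = -1.0 + ellQ * (np.arange(m) + 0.5) / m   # subcube centers in Q = [-1, 1]

y = np.linspace(-1.0, 1.0, 2001)
thetas = [theta_star + a * h**beta * bump((y - yk) / h) for yk in centers]

# sup-norm separation from theta^(0) is a * h**beta (approximately, on a fine grid),
# matching Lemma A1(i) with m^{-beta} = (2h / ellQ)**beta up to the constant c_1
sep = max(np.max(np.abs(t - theta_star)) for t in thetas)
print(a * h**beta, sep)
```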

Appendix A.2. Local Quadratic Control of KL Divergence via QMD

We first establish a uniform local quadratic bound on the Kullback–Leibler divergence between nearby copula densities on $\Theta_0$, and then lift it to the joint model $(U, Y)$.
Lemma A2. There exist constants $\delta > 0$ and $C_{\mathrm{KL}} < \infty$, depending only on the constants in (A3), such that, for all $\vartheta, \eta \in \Theta_0$ with $| \vartheta - \eta | \le \delta$,
$$\mathrm{KL}\big( c(\cdot \mid \vartheta), c(\cdot \mid \eta) \big) \le C_{\mathrm{KL}} (\vartheta - \eta)^2.$$
Proof. Write $c_\theta(\cdot) := c(\cdot \mid \theta)$ and denote the squared Hellinger distance by
$$d_H^2(c_\vartheta, c_\eta) := \int \big( \sqrt{c_\vartheta(u)} - \sqrt{c_\eta(u)} \big)^2 du.$$
By quadratic mean differentiability in (A3), for each $\theta \in \Theta_0$,
$$d_H^2(c_{\theta+t}, c_\theta) = \frac{t^2}{4} I(\theta) + o(t^2), \qquad t \to 0,$$
uniformly in $\theta \in \Theta_0$. Since $I(\theta) \le \overline{I}$ on $\Theta_0$, there exist $\delta_1 > 0$ and $C_h < \infty$ such that, for all $\theta \in \Theta_0$ and $|t| \le \delta_1$, $d_H^2(c_{\theta+t}, c_\theta) \le C_h t^2$. In particular, for any $\vartheta, \eta \in \Theta_0$ with $| \vartheta - \eta | \le \delta_1$, taking $\theta = \eta$ and $t = \vartheta - \eta$ gives $d_H^2(c_\vartheta, c_\eta) \le C_h (\vartheta - \eta)^2$.
Next, we use the standard inequality that $\mathrm{KL}(p, q) \le 4\, d_H^2(p, q)$ whenever $d_H^2(p, q) \le 1/2$. By the continuity of $(\vartheta, \eta) \mapsto d_H^2(c_\vartheta, c_\eta)$ and the compactness of $\Theta_0$, we may shrink $\delta \in (0, \delta_1]$ if needed so that $d_H^2(c_\vartheta, c_\eta) \le 1/2$ whenever $\vartheta, \eta \in \Theta_0$ and $| \vartheta - \eta | \le \delta$. Therefore, for such pairs,
$$\mathrm{KL}(c_\vartheta, c_\eta) \le 4\, d_H^2(c_\vartheta, c_\eta) \le 4 C_h (\vartheta - \eta)^2,$$
which proves the claim with $C_{\mathrm{KL}} := 4 C_h$. □
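The quadratic behavior in Lemma A2 is easy to probe numerically for the Clayton family. The following Monte Carlo sketch (our own; seed, sample size, and parameter values are illustrative) estimates $\mathrm{KL}(c_{\theta_0}, c_{\theta_0 + t})$ for shrinking $t$; the ratio $\mathrm{KL}/t^2$ should stabilize near $I(\theta_0)/2$.

```python
import numpy as np

def clayton_logpdf(u, v, th):
    return (np.log1p(th) - (1 + th) * (np.log(u) + np.log(v))
            - (2 + 1 / th) * np.log(u**(-th) + v**(-th) - 1))

def sample_clayton(th, n, rng):
    # conditional-inverse sampler for the bivariate Clayton copula
    u = rng.uniform(size=n)
    w = rng.uniform(size=n)
    v = (u**(-th) * (w**(-th / (1 + th)) - 1) + 1) ** (-1 / th)
    return u, v

rng = np.random.default_rng(0)
th0, n = 2.0, 400_000
u, v = sample_clayton(th0, n, rng)
for t in [0.4, 0.2, 0.1, 0.05]:
    # KL(c_{th0}, c_{th0+t}) = E_{th0}[log c_{th0} - log c_{th0+t}]
    kl = np.mean(clayton_logpdf(u, v, th0) - clayton_logpdf(u, v, th0 + t))
    print(f"t = {t:4.2f}:  KL ~= {kl:.5f},  KL / t^2 ~= {kl / t**2:.3f}")
```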
The following lemma provides a bound on the Kullback–Leibler divergence between the joint laws of a single observation under two calibration functions.
Lemma A3. Let $\theta, \theta'$ be calibration functions taking values in $\Theta_0$ and satisfying $\| \theta - \theta' \|_\infty \le \delta$, where $\delta$ is as in Lemma A2. Then $\mathrm{KL}(P_\theta, P_{\theta'}) \le C_{\mathrm{KL}}\, \mathbb{E}\big[ (\theta(Y) - \theta'(Y))^2 \big]$, where the expectation is with respect to $Y \sim f_Y$.
Proof. The joint density of $(U, Y)$ under $\theta(\cdot)$ is $p_\theta(u, y) = c(u \mid \theta(y)) f_Y(y)$. Therefore,
$$\mathrm{KL}(P_\theta, P_{\theta'}) = \int f_Y(y) \int c(u \mid \theta(y)) \log \frac{c(u \mid \theta(y))}{c(u \mid \theta'(y))}\, du\, dy = \mathbb{E}\Big[ \mathrm{KL}\big( c(\cdot \mid \theta(Y)), c(\cdot \mid \theta'(Y)) \big) \Big].$$
Since $\| \theta - \theta' \|_\infty \le \delta$, we have $| \theta(Y) - \theta'(Y) | \le \delta$ a.s., and Lemma A2 yields
$$\mathrm{KL}\big( c(\cdot \mid \theta(Y)), c(\cdot \mid \theta'(Y)) \big) \le C_{\mathrm{KL}} \big( \theta(Y) - \theta'(Y) \big)^2 \quad \text{a.s.}$$
Taking expectations completes the proof. □
We next extend the one-observation Kullback–Leibler bound to the N-sample experiment for the packed family of alternatives constructed above.
Proposition A1. Let $\{\theta^{(0)}, \dots, \theta^{(M)}\}$ be the family constructed in Lemma A1, with $h = \ell_Q / (2m)$. Assume that $m$ is large enough so that $\| \theta^{(j)} - \theta^{(0)} \|_\infty \le \delta$ for all $j$, where $\delta$ is as in Lemma A2. Then there exists a constant $C_1 > 0$, depending only on the constants in (A1)–(A4) and on the bump function $\varphi$, such that, for all $j = 1, \dots, M$,
$$\mathrm{KL}\big( P_{\theta^{(j)}}^{\otimes N}, P_{\theta^{(0)}}^{\otimes N} \big) \le C_1 N h^{2\beta + s}.$$
Proof. By independence, $\mathrm{KL}\big( P_{\theta^{(j)}}^{\otimes N}, P_{\theta^{(0)}}^{\otimes N} \big) = N\, \mathrm{KL}( P_{\theta^{(j)}}, P_{\theta^{(0)}} )$. Since $\| \theta^{(j)} - \theta^{(0)} \|_\infty \le \delta$, Lemma A3 implies $\mathrm{KL}\big( P_{\theta^{(j)}}^{\otimes N}, P_{\theta^{(0)}}^{\otimes N} \big) \le N C_{\mathrm{KL}}\, \mathbb{E}\big[ (\theta^{(j)}(Y) - \theta^{(0)}(Y))^2 \big]$. Using (A1) and the fact that $\theta^{(j)} - \theta^{(0)}$ is supported in $Q \subset \mathcal{Y}_0$, we obtain
$$\mathbb{E}\big[ (\theta^{(j)}(Y) - \theta^{(0)}(Y))^2 \big] = \int_Q \big( \theta^{(j)}(y) - \theta^{(0)}(y) \big)^2 f_Y(y)\, dy \le C_f \int_Q \big( \theta^{(j)}(y) - \theta^{(0)}(y) \big)^2\, dy.$$
From the construction in Lemma A1, for each $j \ge 1$, $\theta^{(j)}(y) - \theta^{(0)}(y) = a h^\beta \varphi_j(y)$. Therefore,
$$\int_Q \big( \theta^{(j)}(y) - \theta^{(0)}(y) \big)^2\, dy = a^2 h^{2\beta} \int \varphi_j(y)^2\, dy.$$
A change of variables $z = (y - y_j)/h$ yields
$$\int \varphi_j(y)^2\, dy = h^s \int_{\mathbb{R}^s} \varphi(z)^2\, dz.$$
Consequently,
$$\mathbb{E}\big[ (\theta^{(j)}(Y) - \theta^{(0)}(Y))^2 \big] \le C_f a^2 h^{2\beta + s} \int_{\mathbb{R}^s} \varphi(z)^2\, dz.$$
Combining the above displays proves the result with
$$C_1 := C_{\mathrm{KL}} C_f a^2 \int_{\mathbb{R}^s} \varphi(z)^2\, dz. \qquad \square$$

Appendix A.3. Fano Argument and Completion

We conclude the proof by a standard testing reduction based on Fano’s inequality.
Lemma A4. Let $\{P_0, \dots, P_M\}$ be probability measures, and let $\hat{J}$ be any estimator of the index $J \in \{0, \dots, M\}$ based on data from $P_J$. If
$$\frac{1}{M} \sum_{j=1}^{M} \mathrm{KL}(P_j, P_0) \le \alpha \log M$$
for some $\alpha \in (0, 1/8)$, then $\inf_{\hat{J}} \sup_{0 \le j \le M} P_j( \hat{J} \ne j ) \ge c_2$, for a universal constant $c_2 > 0$.
Proof. See, for example, [16], Theorem 2.5. □
Proof of Theorem 1. Fix $m \ge 2$, and let $\{\theta^{(0)}, \dots, \theta^{(M)}\}$ be the family from Lemma A1, with $h = \ell_Q / (2m)$ and $M = m^s$. Let $P_j := P_{\theta^{(j)}}^{\otimes N}$.
Choose
$$h = \left( \frac{\log N}{N} \right)^{\frac{1}{2\beta + s}} \qquad \text{and} \qquad m = \left\lfloor \frac{\ell_Q}{2h} \right\rfloor.$$
Then $h \to 0$ and $m \to \infty$ as $N \to \infty$. Since $\| \theta^{(j)} - \theta^{(0)} \|_\infty = a h^\beta$, we may choose $a > 0$ small enough so that $a h^\beta \le \delta$ for all sufficiently large $N$, ensuring that Proposition A1 applies.
By Proposition A1,
$$\frac{1}{M} \sum_{j=1}^{M} \mathrm{KL}(P_j, P_0) \le C_1 N h^{2\beta + s} = C_1 \log N.$$
On the other hand, $M = m^s$, so
$$\log M = s \log m \ge s \log \frac{\ell_Q}{4h} = s \log \frac{\ell_Q}{4} + s \log(1/h).$$
In particular, for all sufficiently small $h$, there exists $c_3 > 0$ such that
$$\log M \ge c_3 \log(1/h).$$
With the chosen $h$, we have
$$\log(1/h) = \frac{1}{2\beta + s} \log \frac{N}{\log N} \ge c_4 \log N$$
for all sufficiently large $N$ and some constant $c_4 > 0$. Consequently, $\log M \ge c_5 \log N$ for all $N \ge N_0$, for some constants $c_5 > 0$ and $N_0 \in \mathbb{N}$.
We now enforce the Fano condition by an explicit choice of the bump amplitude. Fix $\alpha \in (0, 1/8)$. Recall from Proposition A1 that $C_1 = C_{\mathrm{KL}} C_f a^2 \int_{\mathbb{R}^s} \varphi(z)^2 dz$, so $C_1$ is proportional to $a^2$. Choose $a > 0$ sufficiently small so that $C_1 \le \alpha c_5$. Then, for all $N \ge N_0$,
$$C_1 \log N \le \alpha c_5 \log N \le \alpha \log M,$$
and hence the condition of Lemma A4 is satisfied.
Lemma A4 then yields a constant $c_2 > 0$ such that
$$\inf_{\hat{J}} \sup_{0 \le j \le M} P_j( \hat{J} \ne j ) \ge c_2.$$
We now reduce estimation to testing. Let $\hat{\theta}$ be any estimator of $\theta(\cdot)$, and define $\hat{J} := \arg\min_{0 \le j \le M} \| \hat{\theta} - \theta^{(j)} \|_\infty$, breaking ties arbitrarily. If $\hat{J} \ne J$, then, by the triangle inequality and Lemma A1(i),
$$\| \hat{\theta} - \theta^{(J)} \|_\infty \ge \frac{1}{2} \min_{k \ne J} \| \theta^{(k)} - \theta^{(J)} \|_\infty \ge \frac{1}{2} c_1 m^{-\beta} \ge \frac{c_1}{2} \left( \frac{2}{\ell_Q} \right)^{\beta} h^\beta.$$
Hence,
$$\sup_{0 \le j \le M} \mathbb{E}_j \| \hat{\theta} - \theta^{(j)} \|_\infty \ge \frac{c_1}{2} \left( \frac{2}{\ell_Q} \right)^{\beta} h^\beta \cdot \sup_{0 \le j \le M} P_j( \hat{J} \ne j ) \ge c_6 h^\beta,$$
for a constant $c_6 > 0$ independent of $N$. Since $\{\theta^{(0)}, \dots, \theta^{(M)}\} \subset H^\beta(L)$, we conclude that
$$\inf_{\hat{\theta}} \sup_{\theta \in H^\beta(L)} \mathbb{E}_\theta \| \hat{\theta} - \theta \|_\infty \ge c_6 h^\beta = c_6 \left( \frac{\log N}{N} \right)^{\frac{\beta}{2\beta + s}},$$
which completes the proof. □

Appendix B. Verification of Assumption (A3) for Clayton and Gumbel Copulas

This appendix verifies Assumption (A3) (quadratic mean differentiability and uniform Fisher information bounds on a compact parameter set) for the bivariate Clayton and Gumbel copula families. Throughout, we work with $d = 2$, which matches the simulation study and is the most common setting for conditional copula calibration. Extensions to fixed $d > 2$ follow the same structure but require heavier notation.

Appendix B.1. A Convenient Sufficient Condition for QMD

Let $\{p_\theta : \theta \in \Theta\}$ be a family of densities on a measurable space $(\mathcal{X}, \mathcal{A})$, with respect to a dominating measure $\mu$, and write $s_\theta = \sqrt{p_\theta}$.
Lemma A5 (A sufficient condition for QMD). Assume that, for every $\theta \in \Theta_0$:
(i) $s_\theta \in L_2(\mu)$, and the map $\theta \mapsto s_\theta$ is differentiable in $L_2(\mu)$; i.e., there exists $\dot{s}_\theta \in L_2(\mu)$ such that
$$\left\| \frac{s_{\theta + t} - s_\theta}{t} - \dot{s}_\theta \right\|_{L_2(\mu)} \to 0 \qquad (t \to 0);$$
(ii) $\dot{s}_\theta = \frac{1}{2} \dot{\ell}_\theta s_\theta$ $\mu$-a.e. for some measurable $\dot{\ell}_\theta$ with $\dot{\ell}_\theta \in L_2(p_\theta)$.
Then the family is quadratic mean-differentiable at $\theta$ with score $\dot{\ell}_\theta$. Moreover, the Fisher information satisfies
$$I(\theta) = \mathbb{E}_\theta[ \dot{\ell}_\theta(X)^2 ] = 4\, \| \dot{s}_\theta \|_{L_2(\mu)}^2.$$
Proof. By (i), $s_{\theta + t} = s_\theta + t \dot{s}_\theta + r_t$ with $\| r_t \|_{L_2} = o(t)$. Thus,
$$\big\| s_{\theta + t} - s_\theta - t \dot{s}_\theta \big\|_{L_2(\mu)}^2 = \| r_t \|_{L_2(\mu)}^2 = o(t^2).$$
By (ii), $t \dot{s}_\theta = \frac{t}{2} \dot{\ell}_\theta s_\theta$, which is exactly the QMD expansion in Assumption (A3). Finally,
$$\int \Big( \frac{1}{2} \dot{\ell}_\theta s_\theta \Big)^2 d\mu = \frac{1}{4} \int \dot{\ell}_\theta^2\, p_\theta\, d\mu = \frac{1}{4} I(\theta),$$
so $I(\theta) = 4\, \| \dot{s}_\theta \|_{L_2(\mu)}^2$. □
In what follows, $\mu$ is the Lebesgue measure on $(0,1)^2$, and $p_\theta = c(\cdot \mid \theta)$ is the copula density.

Appendix B.2. Clayton Copula: QMD and Bounded Fisher Information

The bivariate Clayton copula with parameter $\theta > 0$ has the distribution function
$$C_\theta(u, v) = \big( u^{-\theta} + v^{-\theta} - 1 \big)^{-1/\theta}, \qquad (u, v) \in (0,1)^2,$$
and density
$$c_\theta(u, v) = (1 + \theta)\, (uv)^{-1-\theta} \big( u^{-\theta} + v^{-\theta} - 1 \big)^{-2 - 1/\theta}. \qquad (\mathrm{A1})$$
Fix a compact interval $\Theta_0 = [\theta_-, \theta_+]$ with $0 < \theta_- < \theta_+ < \infty$.
Proposition A2. For the bivariate Clayton family (A1) and any compact $\Theta_0 \subset (0, \infty)$:
(i) The family $\{c_\theta : \theta \in \Theta_0\}$ is quadratic mean-differentiable on an open set containing $\Theta_0$;
(ii) The Fisher information $I(\theta) = \mathbb{E}_\theta[ \dot{\ell}_\theta(U, V)^2 ]$ is finite and continuous on $\Theta_0$; hence, it satisfies $0 < \underline{I} \le I(\theta) \le \overline{I} < \infty$ on $\Theta_0$.
Consequently, Assumption (A3) holds for the Clayton family on $\Theta_0$.
Proof. Step 1: explicit score and a square-integrable envelope.
Write $S_\theta(u, v) := u^{-\theta} + v^{-\theta} - 1$. From (A1),
$$\ell_\theta(u, v) = \log(1 + \theta) - (1 + \theta)(\log u + \log v) - \Big( 2 + \frac{1}{\theta} \Big) \log S_\theta(u, v).$$
Differentiating in $\theta$ (for fixed $(u, v)$):
$$\partial_\theta \ell_\theta(u, v) = \frac{1}{1 + \theta} - (\log u + \log v) + \frac{1}{\theta^2} \log S_\theta - \Big( 2 + \frac{1}{\theta} \Big) \frac{\partial_\theta S_\theta}{S_\theta} = \frac{1}{1 + \theta} - (\log u + \log v) + \frac{1}{\theta^2} \log S_\theta + \Big( 2 + \frac{1}{\theta} \Big) \frac{u^{-\theta} \log u + v^{-\theta} \log v}{S_\theta}. \qquad (\mathrm{A2})$$
For $\theta > 0$ and $(u, v) \in (0,1)^2$, we have $u^{-\theta} \ge 1$ and $v^{-\theta} \ge 1$; hence $S_\theta(u, v) \ge 1$. Therefore $1/S_\theta \le 1$, and moreover,
$$0 \le \frac{u^{-\theta}}{S_\theta} \le 1, \qquad 0 \le \frac{v^{-\theta}}{S_\theta} \le 1.$$
Also, since $S_\theta \le u^{-\theta} + v^{-\theta}$,
$$\log S_\theta \le \log\big( u^{-\theta} + v^{-\theta} \big) \le \log 2 + \log(u^{-\theta}) + \log(v^{-\theta}) = \log 2 + \theta\big( |\log u| + |\log v| \big).$$
Using these inequalities in (A2) and the compactness of $\Theta_0$, there exists a constant $C < \infty$ (depending only on $\Theta_0$) such that, for all $\theta \in \Theta_0$ and all $(u, v) \in (0,1)^2$,
$$| \partial_\theta \ell_\theta(u, v) | \le C \big( 1 + |\log u| + |\log v| \big). \qquad (\mathrm{A3})$$
Now, under any bivariate copula (in particular, under Clayton), the marginals are uniform: $U \sim \mathrm{Unif}(0,1)$ and $V \sim \mathrm{Unif}(0,1)$. Hence,
$$\mathbb{E}[ (\log U)^2 ] = \int_0^1 (\log u)^2\, du = 2 < \infty, \qquad \text{and similarly} \quad \mathbb{E}[ (\log V)^2 ] = 2.$$
Therefore, by (A3),
$$\sup_{\theta \in \Theta_0} \mathbb{E}_\theta \big[ ( \partial_\theta \ell_\theta(U, V) )^2 \big] \le C^2\, \mathbb{E}\big[ ( 1 + |\log U| + |\log V| )^2 \big] < \infty.$$
This shows that the Fisher information $I(\theta)$ is finite for all $\theta \in \Theta_0$.
• Step 2: continuity and uniform bounds for $I(\theta)$.
For each fixed $(u, v)$, $\theta \mapsto \partial_\theta \ell_\theta(u, v)$ is continuous on $\Theta_0$ (all expressions are smooth, and $S_\theta \ge 1$ avoids singularities). Moreover, by (A3), the square $( \partial_\theta \ell_\theta(u, v) )^2$ is dominated by an integrable envelope $C^2 ( 1 + |\log u| + |\log v| )^2$, which does not depend on $\theta$. Thus, by dominated convergence,
$$I(\theta) = \mathbb{E}_\theta \big[ ( \partial_\theta \ell_\theta(U, V) )^2 \big]$$
is continuous on $\Theta_0$. Since $I(\theta) > 0$ for a non-degenerate regular parametric family (the Clayton copulas are distinct for different $\theta$), continuity on the compact set $\Theta_0$ implies
$$0 < \underline{I} := \min_{\theta \in \Theta_0} I(\theta) \le \max_{\theta \in \Theta_0} I(\theta) =: \overline{I} < \infty.$$
• Step 3: QMD on $\Theta_0$.
Define $s_\theta = \sqrt{c_\theta}$. Since $c_\theta$ is a density, $s_\theta \in L_2$. Formally differentiating gives
$$\partial_\theta s_\theta(u, v) = \frac{1}{2} \big( \partial_\theta \ell_\theta(u, v) \big)\, s_\theta(u, v).$$
By Step 1, $( \partial_\theta \ell_\theta )^2$ is integrable under $c_\theta$ uniformly over $\Theta_0$; hence,
$$\int ( \partial_\theta s_\theta )^2\, du\, dv = \frac{1}{4} \mathbb{E}_\theta \big[ ( \partial_\theta \ell_\theta(U, V) )^2 \big] = \frac{1}{4} I(\theta) < \infty.$$
A standard mean-value expansion yields
$$s_{\theta + t} - s_\theta = t\, \partial_\theta s_{\theta + \xi t} \qquad \text{for some } \xi = \xi(u, v, t) \in (0, 1).$$
Using the uniform $L_2$-boundedness of $\partial_\theta s_\vartheta$ for $\vartheta$ near $\Theta_0$, one obtains that $\theta \mapsto s_\theta$ is differentiable in $L_2$, with derivative $\dot{s}_\theta = \partial_\theta s_\theta$. (Equivalently, one can apply Lemma A5 with $\dot{\ell}_\theta = \partial_\theta \ell_\theta$, justified since the envelope in (A3) gives uniform $L_2$ control and continuity.) Therefore, QMD holds on $\Theta_0$.
Combining Steps 1–3 completes the proof. □
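The closed-form score (A2) can be sanity-checked numerically. The following sketch (our own; seed, sample size, parameter value, and step size are illustrative) compares (A2) against a central finite difference of the log-density on a cloud of points in $(0,1)^2$; estimating $I(\theta) = \mathbb{E}_\theta[(\partial_\theta \ell_\theta)^2]$ would additionally require sampling from $c_\theta$, e.g., with the conditional-inverse sampler shown earlier.

```python
import numpy as np

def clayton_logpdf(u, v, th):
    return (np.log1p(th) - (1 + th) * (np.log(u) + np.log(v))
            - (2 + 1 / th) * np.log(u**(-th) + v**(-th) - 1))

def clayton_score(u, v, th):
    # analytic derivative of the log-density, display (A2)
    S = u**(-th) + v**(-th) - 1
    return (1 / (1 + th) - (np.log(u) + np.log(v))
            + np.log(S) / th**2
            + (2 + 1 / th) * (u**(-th) * np.log(u) + v**(-th) * np.log(v)) / S)

rng = np.random.default_rng(0)
u, v = rng.uniform(size=(2, 100_000))
th, eps = 1.5, 1e-5

fd = (clayton_logpdf(u, v, th + eps) - clayton_logpdf(u, v, th - eps)) / (2 * eps)
print("max |analytic - finite diff|:", np.max(np.abs(clayton_score(u, v, th) - fd)))
```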

Appendix B.3. Gumbel Copula: QMD and Bounded Fisher Information

The bivariate Gumbel copula has parameter $\theta \ge 1$ and is defined via the generator
$$C_\theta(u, v) = \exp\Big\{ - \big[ (-\log u)^\theta + (-\log v)^\theta \big]^{1/\theta} \Big\}, \qquad (u, v) \in (0,1)^2.$$
Fix a compact interval $\Theta_0 = [\theta_-, \theta_+]$ with $1 < \theta_- < \theta_+ < \infty$. (Restricting away from the boundary point $\theta = 1$ avoids technicalities associated with the independence limit.)
Let $x = -\log u$, $y = -\log v$, and $A_\theta(x, y) = x^\theta + y^\theta$. The density $c_\theta$ has a known closed form; we do not need to reproduce it fully, only confirming that $c_\theta(u, v)$ is smooth in $\theta$ for $\theta > 1$ and $(u, v) \in (0,1)^2$, and that its log-derivative admits an integrable envelope (shown below).
Proposition A3. For the bivariate Gumbel family and any compact $\Theta_0 \subset (1, \infty)$:
(i) The family $\{c_\theta : \theta \in \Theta_0\}$ is quadratic mean-differentiable on an open set containing $\Theta_0$;
(ii) The Fisher information $I(\theta) = \mathbb{E}_\theta[ \dot{\ell}_\theta(U, V)^2 ]$ is finite and continuous on $\Theta_0$; hence, it satisfies $0 < \underline{I} \le I(\theta) \le \overline{I} < \infty$ on $\Theta_0$.
Consequently, Assumption (A3) holds for the Gumbel family on $\Theta_0$.
Proof. Step 1: an integrable envelope for the score.
Set $X = -\log U$ and $Y = -\log V$. Under any copula, the marginals of $U$ and $V$ are uniform; hence, $X$ and $Y$ are each marginally $\mathrm{Exp}(1)$:
$$P(X \in dx) = e^{-x} \mathbf{1}\{x > 0\}\, dx, \qquad P(Y \in dy) = e^{-y} \mathbf{1}\{y > 0\}\, dy.$$
In particular, $X$ and $Y$ have finite moments of all orders, and also
$$\mathbb{E}[ (\log X)^2 ] < \infty, \qquad \mathbb{E}[ (\log Y)^2 ] < \infty,$$
since $\int_0^1 (\log x)^2\, dx < \infty$ and $\int_1^\infty (\log x)^2 e^{-x}\, dx < \infty$.
The log-density $\ell_\theta(u, v) = \log c_\theta(u, v)$ is a smooth combination of terms built from $X$, $Y$, $A_\theta(X, Y) = X^\theta + Y^\theta$, and $A_\theta(X, Y)^{1/\theta}$, involving only addition, multiplication, logarithms, and powers. Differentiating with respect to $\theta$ produces terms of the schematic form
$$\partial_\theta \ell_\theta(U, V) = \sum_r B_r(\theta)\, T_r(X, Y, \theta),$$
where each $T_r$ is a product of factors of the types
$$\log X, \quad \log Y, \quad \log A_\theta(X, Y), \quad \frac{X^\theta \log X}{A_\theta(X, Y)}, \quad \frac{Y^\theta \log Y}{A_\theta(X, Y)}, \quad \frac{\log A_\theta(X, Y)}{\theta^2},$$
and polynomially bounded functions of $A_\theta(X, Y)^{1/\theta}$ and $1/A_\theta(X, Y)$. On $\Theta_0 \subset (1, \infty)$, we have $A_\theta(X, Y) \ge \max( X^\theta, Y^\theta )$, so $0 \le X^\theta / A_\theta \le 1$ and $0 \le Y^\theta / A_\theta \le 1$, and also $A_\theta > 0$. Moreover, $\log A_\theta(X, Y) \le \log 2 + \theta( \log^+ X + \log^+ Y )$, and $\log A_\theta(X, Y)$ can only be large in magnitude and negative when both $X$ and $Y$ are small, which remains integrable because $\int_0^1 |\log x|^2\, dx < \infty$.
Therefore, there exists $C < \infty$, depending only on $\Theta_0$, such that
$$| \partial_\theta \ell_\theta(U, V) | \le C \big( 1 + |\log X| + |\log Y| \big) \qquad \text{for all } \theta \in \Theta_0. \qquad (\mathrm{A4})$$
Since $(\log X)^2$ and $(\log Y)^2$ have finite expectations, (A4) implies
$$\sup_{\theta \in \Theta_0} \mathbb{E}_\theta \big[ ( \partial_\theta \ell_\theta(U, V) )^2 \big] < \infty,$$
so $I(\theta)$ is finite on $\Theta_0$.
• Step 2: continuity and uniform bounds for $I(\theta)$.
For each fixed $(u, v) \in (0,1)^2$, $\theta \mapsto \partial_\theta \ell_\theta(u, v)$ is continuous in $\theta$ for $\theta > 1$. The dominating envelope in (A4) is square-integrable and does not depend on $\theta$. Hence, by dominated convergence, $I(\theta) = \mathbb{E}_\theta[ ( \partial_\theta \ell_\theta(U, V) )^2 ]$ is continuous on $\Theta_0$. The identifiability of the Gumbel family implies $I(\theta) > 0$ for $\theta > 1$, so compactness gives $0 < \underline{I} \le I(\theta) \le \overline{I} < \infty$ on $\Theta_0$.
• Step 3: QMD on $\Theta_0$.
Define $s_\theta = \sqrt{c_\theta}$. As in the Clayton case, the smoothness of $c_\theta$ in $\theta$ for $\theta > 1$ yields
$$\partial_\theta s_\theta = \frac{1}{2} ( \partial_\theta \ell_\theta )\, s_\theta.$$
Using Step 1,
$$\int ( \partial_\theta s_\theta )^2\, du\, dv = \frac{1}{4} \mathbb{E}_\theta[ ( \partial_\theta \ell_\theta(U, V) )^2 ] = \frac{1}{4} I(\theta) < \infty,$$
uniformly on $\Theta_0$. The same $L_2$ mean-value argument then shows that $\theta \mapsto s_\theta$ is differentiable in $L_2$ on $\Theta_0$; hence, QMD holds by Lemma A5 with score $\dot{\ell}_\theta = \partial_\theta \ell_\theta$.
This proves (A3) for the bivariate Gumbel family on $\Theta_0$. □
Remark A1 (Why we restrict away from the boundary points $\theta = 0$ and $\theta = 1$). The compact restriction $\Theta_0 \subset \mathrm{int}(\Theta)$ in Assumption (A3) is standard in local asymptotic theory. For Clayton, $\theta \downarrow 0$ approaches the independence copula and can create degeneracies in curvature. For Gumbel, $\theta \downarrow 1$ also approaches independence and similarly requires separate treatment. In the conditional copula setting, such boundary regimes are typically excluded by construction (e.g., via a link function and compact range restriction).

References

  1. Nelsen, R.B. An Introduction to Copulas; Springer: Berlin/Heidelberg, Germany, 2006. [Google Scholar]
  2. Sklar, M. Fonctions de repartition à n dimensions et leurs marges. Ann. L’ISUP 1959, 8, 229–231. [Google Scholar]
  3. Acar, E.F.; Craiu, R.V.; Yao, F. Dependence calibration in conditional copulas: A nonparametric approach. Biometrics 2011, 67, 445–453. [Google Scholar] [CrossRef] [PubMed]
  4. Muia, M.N. Uniform Asymptotic Theory for Local Likelihood Estimation of Covariate-Dependent Copula Parameters. arXiv 2026, arXiv:2601.01345. [Google Scholar] [CrossRef]
  5. Patton, A.J. Modelling asymmetric exchange rate dependence. Int. Econ. Rev. 2006, 47, 527–556. [Google Scholar] [CrossRef]
  6. Abegaz, F.; Gijbels, I.; Veraverbeke, N. Semiparametric estimation of conditional copulas. J. Multivar. Anal. 2012, 110, 43–73. [Google Scholar] [CrossRef]
  7. Veraverbeke, N.; Omelka, M.; Gijbels, I. Estimation of a conditional copula and association measures. Scand. J. Stat. 2011, 38, 766–780. [Google Scholar] [CrossRef]
  8. Fan, J.; Gijbels, I. Local Polynomial Modelling and Its Applications; Chapman & Hall: London, UK, 1996; Volume 66. [Google Scholar]
  9. Gijbels, I.; Veraverbeke, N.; Omelka, M. Partial and average copulas and association measures. Electron. J. Stat. 2015, 9, 2420–2474. [Google Scholar] [CrossRef]
  10. Gijbels, I.; Omelka, M.; Pešta, M.; Veraverbeke, N. Score tests for covariate effects in conditional copulas. J. Multivar. Anal. 2017, 159, 111–133. [Google Scholar] [CrossRef]
  11. Newey, W.; McFadden, D. Large sample estimation and hypothesis testing. In Handbook of Econometrics; Elsevier: Amsterdam, The Netherlands, 1994; Volume 4, pp. 2111–2245. [Google Scholar]
  12. Gijbels, I.; Omelka, M.; Veraverbeke, N. Nonparametric testing for no covariate effects in conditional copulas. Statistics 2017, 51, 475–509. [Google Scholar] [CrossRef]
  13. Gijbels, I.; Omelka, M.; Veraverbeke, N. Estimation of a copula when a covariate affects only marginal distributions. Scand. J. Stat. 2015, 42, 1109–1126. [Google Scholar] [CrossRef]
  14. Gijbels, I.; Omelka, M.; Veraverbeke, N. Multivariate and functional covariates and conditional copulas. Electron. J. Stat. 2012, 6, 1273–1306. [Google Scholar] [CrossRef]
  15. van der Vaart, A.W.; Wellner, J.A. Weak Convergence and Empirical Processes: With Applications to Statistics; Springer: Berlin/Heidelberg, Germany, 1996. [Google Scholar]
  16. Tsybakov, A.B. Introduction to Nonparametric Estimation; Springer: Berlin/Heidelberg, Germany, 2009. [Google Scholar]
  17. Yu, B. Assouad, Fano, and Le Cam. In Festschrift for Lucien Le Cam; Springer: Berlin/Heidelberg, Germany, 1997; pp. 423–435. [Google Scholar]
  18. van der Vaart, A.W. Asymptotic Statistics; Cambridge University Press: Cambridge, UK, 2000; Volume 3. [Google Scholar]
  19. Czado, C.; Nagler, T. Vine copula based modeling. Annu. Rev. Stat. Its Appl. 2022, 9, 453–477. [Google Scholar] [CrossRef]
  20. Cheng, T.; Lesmana, N.S.; Poreddy, S.R.; Chen, K. Predictive Uncertainty Quantification for Financial DNN Using Regular Vine Copula. In Proceedings of the 6th ACM International Conference on AI in Finance, Singapore, 15–18 November 2025; pp. 873–881. [Google Scholar]
  21. Muia, M. Dependence and Mixing for Perturbations of Copula-Based Markov Chains. Ph.D. Thesis, University of Mississippi, Ann Arbor, MI, USA, 2024. [Google Scholar]
Figure 1. Curve recovery for $p = 1$ ($\beta = 2$, $s = 1$) at $N = 750$: (a) Clayton and (b) Gumbel copulas. In each panel, the blue curve denotes the true calibration function $\theta(y)$ on $\mathcal{Y}_0$, while the orange and green curves give the Monte Carlo mean of $\hat{\theta}(y)$ under $h_{\mathrm{local}}$ and the LOO-CVL bandwidth $h_{\mathrm{cv}}$, respectively. The shaded region shows the pointwise 95% confidence interval for the estimator based on $h_{\mathrm{cv}}$.
Figure 2. Theoretical localization scale $h_{\mathrm{local}}$ and Monte Carlo mean LOO-CVL bandwidth $\bar{h}_{\mathrm{cv}}$ for $p = 1$ ($\beta = 2$, $s = 1$). Panel (a): Clayton; panel (b): Gumbel.
Figure 3. Monte Carlo mean sup-norm error $\|\hat{\theta} - \theta\|_\infty$ for $p = 1$ ($\beta = 2$, $s = 1$; $R = 100$). Panel (a): Clayton; panel (b): Gumbel. The blue curve corresponds to $h_{\mathrm{local}}$ and the orange curve to $h_{\mathrm{cv}}$. The vertical axis is log-scaled and $N \in \{50, 100, 250, 500, 750\}$.
Figure 4. Monte Carlo mean discrete $L_2$ error for $p = 1$ ($\beta = 2$, $s = 1$; $R = 100$). Panel (a): Clayton; panel (b): Gumbel. The blue curve corresponds to $h_{\mathrm{local}}$ and the orange curve to $h_{\mathrm{cv}}$. The vertical axis is log-scaled and $N \in \{50, 100, 250, 500, 750\}$.
Figure 5. Monte Carlo mean sup-norm error under $h_{\mathrm{cv}}$ (blue) and the benchmark rate $r_N = (\log N / N)^{2/5}$ (orange) for $p = 1$ ($\beta = 2$, $s = 1$; $R = 100$). Panel (a): Clayton; panel (b): Gumbel. The vertical axis is log-scaled and $N \in \{50, 100, 250, 500, 750\}$.
Table 1. Comparison of the theoretical localization scale $h_{\mathrm{local}}$ and the data-driven LOO-CVL bandwidth $h_{\mathrm{cv}}$ for $p = 1$ ($\beta = 2$, $s = 1$), based on $R = 100$ Monte Carlo replications. For each $N$, $h_{\mathrm{local}}$ is deterministic, whereas the reported $\bar{h}_{\mathrm{cv}}$ and its accompanying standard deviation summarize the Monte Carlo mean and variability of the LOO-CVL bandwidth across replications.

|       | Clayton              |                         |                                | Gumbel               |                         |                                |
| $N$   | $h_{\mathrm{local}}$ | $\bar{h}_{\mathrm{cv}}$ | $\mathrm{SD}(h_{\mathrm{cv}})$ | $h_{\mathrm{local}}$ | $\bar{h}_{\mathrm{cv}}$ | $\mathrm{SD}(h_{\mathrm{cv}})$ |
| 50    | 0.6007               | 0.7578                  | 0.1537                         | 0.6007               | 1.2657                  | 0.4851                         |
| 100   | 0.5403               | 0.7075                  | 0.1382                         | 0.5403               | 1.1485                  | 0.4007                         |
| 250   | 0.4665               | 0.6227                  | 0.1046                         | 0.4665               | 0.8710                  | 0.3231                         |
| 500   | 0.4158               | 0.5457                  | 0.1018                         | 0.4158               | 0.7022                  | 0.2181                         |
| 750   | 0.3883               | 0.5263                  | 0.0767                         | 0.3883               | 0.6433                  | 0.1935                         |
Table 2. Clayton copula ($p = 1$, $\beta = 2$): Monte Carlo mean and standard deviation (SD) of the sup-norm loss and the discrete $L_2$ loss over $\mathcal{Y}_0$, based on $R = 100$ replications, comparing $h_{\mathrm{local}}$ to the LOO-CVL bandwidth.

| $N$ | $\|\cdot\|_{\infty,\mathrm{local}}$ | $\mathrm{SD}_{\infty,\mathrm{local}}$ | $\|\cdot\|_{\infty,\mathrm{cv}}$ | $\mathrm{SD}_{\infty,\mathrm{cv}}$ |
| 50  | 3.6432 | 2.2898 | 3.4876 | 2.3962 |
| 100 | 2.3577 | 1.8444 | 2.0054 | 1.6909 |
| 250 | 1.0002 | 0.5554 | 0.8960 | 0.5577 |
| 500 | 0.6564 | 0.2460 | 0.6160 | 0.2777 |
| 750 | 0.5419 | 0.1957 | 0.4767 | 0.2050 |

| $N$ | $L_{2,\mathrm{local}}$ | $\mathrm{SD}_{L_2,\mathrm{local}}$ | $L_{2,\mathrm{cv}}$ | $\mathrm{SD}_{L_2,\mathrm{cv}}$ |
| 50  | 1.8661 | 1.7432 | 1.7981 | 1.9313 |
| 100 | 1.0141 | 1.2549 | 0.8771 | 1.0933 |
| 250 | 0.3932 | 0.1261 | 0.3576 | 0.1175 |
| 500 | 0.2730 | 0.0828 | 0.2559 | 0.0953 |
| 750 | 0.2259 | 0.0595 | 0.2042 | 0.0672 |
Table 3. Gumbel copula ($p = 1$, $\beta = 2$): Monte Carlo mean and standard deviation (SD) of the sup-norm loss and the discrete $L_2$ loss over $\mathcal{Y}_0$, based on $R = 100$ replications, comparing $h_{\mathrm{local}}$ to the LOO-CVL bandwidth.

| $N$ | $\|\cdot\|_{\infty,\mathrm{local}}$ | $\mathrm{SD}_{\infty,\mathrm{local}}$ | $\|\cdot\|_{\infty,\mathrm{cv}}$ | $\mathrm{SD}_{\infty,\mathrm{cv}}$ |
| 50  | 2.4712 | 0.8193 | 1.8459 | 1.0658 |
| 100 | 1.5030 | 0.8249 | 1.0787 | 0.8057 |
| 250 | 0.7559 | 0.4800 | 0.6738 | 0.5623 |
| 500 | 0.4679 | 0.1568 | 0.4434 | 0.3390 |
| 750 | 0.3722 | 0.1291 | 0.3446 | 0.1514 |

| $N$ | $L_{2,\mathrm{local}}$ | $\mathrm{SD}_{L_2,\mathrm{local}}$ | $L_{2,\mathrm{cv}}$ | $\mathrm{SD}_{L_2,\mathrm{cv}}$ |
| 50  | 0.8973 | 0.3512 | 0.7337 | 0.4290 |
| 100 | 0.5150 | 0.2349 | 0.4424 | 0.2665 |
| 250 | 0.2875 | 0.1183 | 0.2632 | 0.1438 |
| 500 | 0.1967 | 0.0531 | 0.1838 | 0.0885 |
| 750 | 0.1549 | 0.0422 | 0.1479 | 0.0510 |