Article

Minimax Lower Bounds for Uniform Estimation of Covariate-Dependent Copula Parameters

by Mathias Nthiani Muia 1,*, Olivia Atutey 1 and Chathurika Srimali Abeykoon 2

1 Department of Mathematics and Statistics, University of South Alabama, Mobile, AL 36688, USA
2 Department of Mathematics and Statistics, Rhodes College, Memphis, TN 38112, USA
* Author to whom correspondence should be addressed.
Mathematics 2026, 14(5), 914; https://doi.org/10.3390/math14050914
Submission received: 9 February 2026 / Revised: 4 March 2026 / Accepted: 6 March 2026 / Published: 8 March 2026
(This article belongs to the Special Issue Advances in Probability Theory and Stochastic Analysis)

Abstract

Local likelihood methods are widely used to estimate calibration functions in conditional copula models. Recent work has established uniform stochastic equicontinuity and uniform convergence rates for local likelihood estimators of covariate-dependent copula parameters, yielding global consistency guarantees and supporting the stability of local optimization routines. This paper complements those results by deriving minimax lower bounds for uniform estimation over Hölder classes of calibration functions. Under mild regularity conditions on the copula family and the covariate design, we show that the minimax sup-norm risk over a compact covariate region is bounded below by the classical nonparametric rate for smooth functions on an s-dimensional domain. The proof combines a localized packing construction with a Fano–Le Cam testing argument, using second-order expansions of the conditional copula likelihood to control information distances. As a consequence, local polynomial likelihood estimators achieve the minimax rate up to the logarithmic factors inherent to uniform estimation, providing a sharp optimality justification for their use in conditional copula modeling.

1. Introduction

Copula models provide a flexible framework for modeling multivariate dependence by separating marginal distributions from the dependence structure, as guaranteed by Sklar’s theorem [1,2]. In many applications, however, the strength and form of dependence vary with observable covariates such as time, spatial location, or environmental conditions. Conditional copula models accommodate this heterogeneity by allowing the copula parameter to vary smoothly with covariates [3,4,5].
Early work in this area includes the semiparametric conditional copula model of [6], who proposed local pseudo-likelihood estimation with local polynomial approximation and established consistency, asymptotic normality, and bandwidth selection procedures. A related local likelihood framework for parametric conditional copulas was developed by [3], who derived pointwise bias and variance expressions, introduced cross-validated copula selection, and constructed confidence intervals for covariate-dependent dependence parameters. Fully nonparametric approaches were studied by [7], who analyzed the asymptotic properties of conditional copula estimators and associated dependence measures.
Let $Y \in \mathcal{Y} \subset \mathbb{R}^s$ denote a vector of covariates and $U = (U_1, \dots, U_d) \in (0,1)^d$ pseudo-observations obtained from continuous conditional marginals. A conditional copula model assumes that
$$U \mid Y = y \;\sim\; C(\cdot \mid \theta(y)), \qquad y \in \mathcal{Y}, \qquad (1)$$
where $\theta(\cdot)$ is an unknown parameter function taking values in a parameter space $\Theta$.
The central inferential problem is the uniform estimation of the covariate-dependent copula parameter $\theta(\cdot)$. Local likelihood methods are particularly well suited for this task: rather than modeling $\theta(\cdot)$ directly, one locally approximates a suitably transformed calibration function by a polynomial obtained from a Taylor expansion around each evaluation point and maximizes a kernel-weighted copula log-likelihood [3,8]. This approach, introduced by [3], has been further developed in [4]. Throughout this literature, including the present paper, the marginal distributions are assumed to be known, so that the pseudo-observations $U$ are treated as directly observed.
The assumption of known margins is standard in theoretical analyses of conditional copula models and is appropriate in several practically relevant settings, including cases where marginal models are specified a priori, where margins can be estimated at a faster parametric rate and treated as known in a second stage, or where inference focuses primarily on covariate-dependent dependence. Under these conditions, the copula likelihood provides a valid basis for inference on $\theta(\cdot)$, and marginal estimation error can be neglected at the level of first-order asymptotics. In empirical applications where marginal distributions depend on covariates, pseudo-observations that are approximately $\mathrm{Unif}(0,1)$ conditional on $Y$ can be constructed by first removing the systematic effect of $Y$ from each margin and then applying an empirical CDF (rank) transformation to the adjusted observations, following [3,9,10]. This two-step adjustment ensures compatibility with the copula modeling framework while preserving the focus on covariate-dependent dependence.
From a decision-theoretic perspective, the model with unknown margins strictly contains the known-margin experiment studied in this paper as a submodel. Consequently, the corresponding minimax risk in the larger experiment cannot be smaller than that in the restricted setting analyzed here. The lower bound established in the main result of this work (Theorem 1), therefore, remains valid under unknown margins, since any estimator in the larger experiment must, in particular, operate over the known-margin submodel.
The minimax lower bound proved below is information-theoretic and does not depend on the construction of a particular estimator. The additional nuisance estimation of the marginal distributions may affect constants or higher-order terms in the uniform rate, but it cannot invalidate the lower bound itself. When the margins are estimated at a $\sqrt{N}$-consistent rate, standard two-step semiparametric arguments (see, e.g., [11]) suggest that their impact should be negligible relative to the slower nonparametric rate $(\log N / N)^{\beta/(2\beta+s)}$. A fully uniform-in-$y$ treatment of the joint margin–copula experiment is beyond the scope of the present work.
Related work has addressed the testing and estimation of covariate effects in conditional copula models. For example, [12] proposed fully nonparametric tests of the simplifying assumption that covariates affect dependence only through the margins, while [10] developed score-based tests for parametric specifications of covariate effects. Earlier work by [13] established consistency and weak convergence of copula estimators under the simplifying assumption, and extensions to multivariate and functional covariates were studied in [14].
From a theoretical perspective, early analyses of local likelihood methods focused on pointwise bias, variance, and asymptotic normality at fixed covariate values [3]. However, uniform guarantees are essential in practice, as they underpin numerical stability, bandwidth selection based on global criteria, and simultaneous inference over covariate regions.
Recent work has developed a uniform asymptotic theory for kernel-weighted local likelihood estimation of covariate-dependent copula parameters, establishing uniform convergence of the local log-likelihood and its derivatives and deducing uniform consistency of the estimated parameter curve and dependence measures, such as Kendall's $\tau$ [4]. The corresponding uniform upper bound for the estimation error $\sup_{y \in \mathcal{Y}_0} |\hat{\theta}(y) - \theta(y)|$, together with the required regularity conditions, is established in the companion paper [4]. Under standard smoothness and design conditions, the local likelihood estimator $\hat{\theta}$ satisfies [4]
$$\sup_{y \in \mathcal{Y}_0} \big| \hat{\theta}(y) - \theta(y) \big| = O_p\!\left( h^{p+1} + \sqrt{\frac{\log(1/h)}{N h^s}} \right), \qquad (2)$$
uniformly over compact $\mathcal{Y}_0 \subset \mathrm{int}(\mathcal{Y})$, where the logarithmic factor arises from entropy bounds for kernel-indexed function classes [4,15].
While such uniform upper bounds provide strong guarantees, they do not establish optimality. Minimax theory characterizes the intrinsic difficulty of an estimation problem by identifying the best achievable worst-case risk over a function class. Establishing minimax lower bounds is, therefore, essential to determine whether existing procedures attain optimal rates and to identify unavoidable sources of estimation error.
The goal of this paper is to derive minimax lower bounds for the uniform estimation of the calibration function in the conditional copula model (1), under the known-margin framework described above. Working over Hölder classes of order $p + 1$, we establish a lower bound for the minimax sup-norm risk over $\mathcal{Y}_0$. The resulting rate coincides with the classical minimax rate for the sup-norm estimation of a smooth regression function, showing that the nonlinear copula likelihood does not alter the fundamental global difficulty of the problem.
The proof is based on a localized packing construction combined with an information-theoretic testing argument of Fano–Le Cam type. The control of the Kullback–Leibler divergences between the induced joint distributions is achieved via second-order expansions of the conditional copula likelihood under mild curvature conditions. The argument is self-contained and relies on standard tools from empirical process theory and minimax analysis [15,16,17]. To the best of our knowledge, this is the first minimax lower bound for uniform estimation in conditional copula models.
The remainder of the paper is organized as follows. Section 2 introduces the conditional copula model and reviews the local likelihood framework that motivates our minimax analysis. Section 3 defines the smoothness class, loss function, and standing regularity assumptions. Section 4 presents the main result: a minimax lower bound for uniform estimation over compact covariate regions. Section 5 reports a simulation study illustrating the finite-sample performance of local polynomial likelihood estimators under Clayton and Gumbel copulas and compares empirical uniform errors with the minimax benchmark rate. Section 6 concludes with a discussion of the main implications and possible extensions. Appendix A contains the complete proof of the minimax lower bound, including all auxiliary lemmas and the Fano–Le Cam testing argument with Kullback–Leibler control derived from likelihood curvature. The subsequent Appendix B verifies Assumption (A3) by establishing quadratic mean differentiability and uniform Fisher information bounds for the bivariate Clayton and Gumbel copula families used in the simulation study.

2. Conditional Copula Model and Local Likelihood Estimation

This section introduces the conditional copula model and a standard local likelihood estimator for the calibration function. The estimator provides the methodological motivation for the minimax lower bound derived later. Importantly, the minimax result in Section 4 is information-theoretic and applies to any estimator of $\theta(\cdot)$; it does not rely on the local likelihood form. The role of the present section is, therefore, to (i) fix notation for the statistical experiment and the estimation target, and (ii) connect the benchmark minimax rate to a concrete procedure whose uniform performance is studied in [4].

2.1. Conditional Copula Model

Let $\{(U_i, Y_i)\}_{i=1}^N$ be i.i.d. observations, where $Y_i \in \mathbb{R}^s$ is a vector of covariates, and $U_i = (U_{i1}, \dots, U_{id}) \in (0,1)^d$ are pseudo-observations obtained from continuous conditional marginal distributions. Throughout, the marginals are assumed to be known, so the $U_i$ are treated as directly observed.
Fix a parametric copula family $\{C(\cdot \mid \theta) : \theta \in \Theta\}$ with density $c(\cdot \mid \theta)$. The conditional copula model assumes that, given $Y_i = y$,
$$U_i \mid Y_i = y \;\sim\; C(\cdot \mid \theta(y)), \qquad (3)$$
where $\theta(\cdot) : \mathcal{Y} \to \Theta$ is an unknown calibration function governing how dependence varies with the covariate. For simplicity, we treat $\theta(y)$ as scalar. For $\theta(y) \in \mathbb{R}^k$ with fixed $k$, the lower-bound construction extends by packing in $\mathbb{R}^k$ using coordinate-wise bump perturbations and measuring risk under an appropriate norm (e.g., $\ell_2$ or $\ell_\infty$). The analysis then requires uniform quadratic mean differentiability and positive-definite Fisher information matrices in all parameter directions. The resulting minimax rate remains $(\log N / N)^{\beta/(2\beta+s)}$, since $k$ is fixed. Our inferential target is the function $\theta(\cdot)$, with an emphasis on uniform accuracy over an interior region $\mathcal{Y}_0 \subset \mathrm{int}(\mathcal{Y})$.

2.2. Local Likelihood Estimator and the Role of the Link Function

Copula parameters are typically constrained (e.g., $\theta > 0$ or $\theta \ge 1$), so implementation is simplified by working on an unconstrained scale. Let $\psi : \Theta \to \mathbb{R}$ be a strictly monotone link, and define $\nu(y) = \psi(\theta(y))$, so that $\theta(y) = \psi^{-1}(\nu(y))$ (see also [3,4]). The link is an estimation device: it facilitates local polynomial modeling on $\mathbb{R}$ while ensuring that $\hat{\theta}(y)$ lies in $\Theta$. In contrast, the minimax analysis in Section 3 and Section 4 and Appendix A is formulated directly for $\theta(\cdot)$ restricted to a compact subset $\Theta_0 \subset \mathrm{int}(\Theta)$, so the link function does not enter the lower-bound arguments. Since $\psi$ is smooth and strictly monotone, and $\Theta_0$ is compactly contained in $\mathrm{int}(\Theta)$, its derivative is bounded and bounded away from zero on $\Theta_0$. Hence, both $\psi$ and $\psi^{-1}$ are Lipschitz on $\Theta_0$ (a property also used in [4]). Consequently, the sup-norm losses in $\theta$ and in the transformed parameter $\nu = \psi(\theta)$ are equivalent up to multiplicative constants. Minimax rates under sup-norm loss are therefore invariant under such smooth monotone reparameterizations, and the lower-bound analysis can be conducted directly in terms of $\theta(\cdot)$ without loss of generality.
Let $p \ge 0$ be an integer. Around a fixed evaluation point $y \in \mathcal{Y}$, approximate $\nu(\cdot)$ by a multivariate polynomial of total degree $p$: $\nu(Y_i) \approx \phi_p(Y_i - y)^\top \gamma$, where $\phi_p(\cdot)$ collects monomials up to degree $p$, and $\gamma$ is the vector of local coefficients [4]. Given a kernel $K$ and a bandwidth $h > 0$, define $K_h(v) = h^{-s} K(v/h)$ and the kernel-weighted local copula log-likelihood
$$L_N(\gamma; y, p, h) = \sum_{i=1}^{N} \log c\Big( U_i \,\Big|\, \psi^{-1}\big( \phi_p(Y_i - y)^\top \gamma \big) \Big)\, K_h(Y_i - y). \qquad (4)$$
The local maximum likelihood estimator is
$$\hat{\gamma}(y) = \arg\max_{\gamma} L_N(\gamma; y, p, h), \qquad \hat{\nu}(y) = e_0^\top \hat{\gamma}(y), \qquad \hat{\theta}(y) = \psi^{-1}(\hat{\nu}(y)),$$
where $e_0$ selects the intercept term. Solving this maximization over a grid of evaluation points yields an estimator of the full calibration curve $\theta(\cdot)$.
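To make the procedure concrete, the following minimal Python sketch implements the local linear ($p = 1$, $s = 1$) version of (4) for the Clayton family with a log link $\psi = \log$, using the Epanechnikov kernel employed in Section 5. This is an illustration under stated assumptions, not the authors' supplementary code: the function names, the starting value, the clipping bounds, and the Nelder–Mead optimizer are our own choices.

```python
import numpy as np
from scipy.optimize import minimize

def clayton_logpdf(u, v, theta):
    # log c_theta(u, v) for the bivariate Clayton copula, theta > 0
    return (np.log1p(theta)
            - (1.0 + theta) * (np.log(u) + np.log(v))
            - (2.0 + 1.0 / theta) * np.log(u**(-theta) + v**(-theta) - 1.0))

def local_linear_theta(u, v, Y, y0, h):
    """Local linear likelihood estimate of theta(y0); a sketch of eq. (4)."""
    keep = np.abs(Y - y0) <= h                 # Epanechnikov kernel support
    u0, v0, d = u[keep], v[keep], Y[keep] - y0
    w = 0.75 * (1.0 - (d / h) ** 2) / h        # kernel weights K_h(Y_i - y)

    def neg_loglik(gamma):
        nu = gamma[0] + gamma[1] * d           # local polynomial, degree p = 1
        theta = np.exp(np.clip(nu, -5.0, 3.0)) # inverse log link; clipped for stability
        return -np.sum(w * clayton_logpdf(u0, v0, theta))

    res = minimize(neg_loglik, x0=np.array([np.log(2.0), 0.0]), method="Nelder-Mead")
    return float(np.exp(res.x[0]))             # theta_hat(y0) = psi^{-1}(nu_hat(y0))
```

Evaluating this function over a grid of points $y_0$ yields the estimated calibration curve described above.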
For each evaluation point $y$, the computation of $\hat{\gamma}(y)$ requires maximization of the kernel-weighted log-likelihood (4), involving $N$ weighted likelihood contributions. If estimation is performed over $m$ grid points, the computational cost per bandwidth value is of order $O(mN)$, up to constants depending on the optimization routine and parameter dimension.
In the multivariate covariate setting ($s > 1$), additional cost arises from the kernel evaluation in $\mathbb{R}^s$ and from the larger number of grid points needed to cover the covariate domain. When $\theta(y) \in \mathbb{R}^k$ with fixed $k$, the optimization cost scales linearly in $k$. Bandwidth selection via leave-one-out cross-validation multiplies this cost by the number of candidate bandwidths. Although the runtime increases with the covariate dimension due to standard curse-of-dimensionality effects, the procedure remains tractable for moderate $s$ and fixed $k$. The uniform asymptotic properties of $\hat{\theta}(\cdot)$ under this framework are established in [4].
The bandwidth $h$ in (4) controls the degree of smoothing in the local likelihood estimator. In the proof of the minimax lower bound (Appendix A), a separate localization scale is introduced through the support of the bump functions and is chosen proportionally to $(\log N / N)^{1/(2\beta+s)}$, in order to balance the separation between alternative parameter functions and the resulting Kullback–Leibler divergences. This scaling coincides with the canonical rate governing uniform bias–variance tradeoffs for local polynomial estimators and thus links the constructive procedure in this section to the minimax benchmark established in Theorem 1.

2.3. Bandwidth Selection

Implementing $\hat{\theta}(\cdot)$ requires selecting both a smoothing bandwidth and a copula family. Bandwidth selection is typically carried out using the leave-one-out cross-validated local likelihood. Let $\hat{\theta}_h(\cdot)$ denote the estimator computed with bandwidth $h$, and let $\hat{\theta}_{h,-i}(Y_i)$ be the corresponding leave-one-out estimate evaluated at $Y_i$. Define the cross-validated local log-likelihood
$$\mathrm{CVL}(h) = \sum_{i=1}^{N} \log c\big( U_i \mid \hat{\theta}_{h,-i}(Y_i) \big), \qquad (5)$$
and select the local-likelihood leave-one-out cross-validation (LOO-CVL) bandwidth as
$$h_{\mathrm{cv}} = \arg\max_{h} \mathrm{CVL}(h). \qquad (6)$$
This approach was introduced by [3] and extended to multivariate covariates in [4].
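As an illustration, the criterion (5)–(6) can be implemented by brute force on a bandwidth grid. The sketch below reuses clayton_logpdf and local_linear_theta from the previous snippet; it is again our own simplified version, and the candidate grid h_grid is a hypothetical choice.

```python
def loo_cvl(h, u, v, Y):
    """Leave-one-out cross-validated local likelihood CVL(h), cf. eq. (5)."""
    idx = np.arange(len(Y))
    total = 0.0
    for i in idx:
        keep = idx != i                      # drop observation i before refitting
        th_i = local_linear_theta(u[keep], v[keep], Y[keep], y0=Y[i], h=h)
        total += clayton_logpdf(u[i], v[i], th_i)
    return total

# h_cv maximizes CVL over a candidate grid, cf. eq. (6):
# h_grid = np.linspace(0.3, 1.5, 13)
# h_cv = max(h_grid, key=lambda h: loo_cvl(h, u, v, Y))
```

The nested leave-one-out loop makes the cost per candidate bandwidth quadratic in $N$ for this naive version, consistent with the cost discussion in Section 2.2.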

3. Model, Loss, and Regularity Conditions

Let $\{(U_i, Y_i)\}_{i=1}^N$ be i.i.d. observations, where $Y_i \in \mathbb{R}^s$ denotes a vector of covariates, and $U_i = (U_{i1}, \dots, U_{id}) \in (0,1)^d$ are pseudo-observations. Conditional on $Y_i = y$, we assume that $U_i$ has copula density $c(\cdot \mid \theta(y))$ for a parametric family $\{c(\cdot \mid \theta) : \theta \in \Theta \subset \mathbb{R}\}$. For clarity, we take the copula parameter to be scalar. The analysis extends to $\theta(y) \in \mathbb{R}^k$ with fixed $k$ by replacing scalar perturbations with coordinate-wise vector perturbations and imposing multivariate quadratic mean differentiability, together with uniformly positive-definite Fisher information matrices. Since $k$ is fixed, this modification affects only constants and not the minimax rate. Our objective is the uniform estimation of the parameter function $\theta(\cdot)$ over a compact region $\mathcal{Y}_0 \subset \mathrm{int}(\mathcal{Y})$.

3.1. Smoothness Class and Minimax Risk

To formalize the smoothness of the parameter function, we work over Hölder-type classes.
Definition 1 (Hölder class; [16]). Let $\beta > 0$ and write $p = \lfloor \beta \rfloor$ and $\kappa = \beta - p \in (0, 1]$. For $L > 0$, define $H^\beta(L)$ to be the collection of functions $\theta : \mathcal{Y} \to \mathbb{R}$ such that:
(i) $\theta$ has continuous partial derivatives $\partial^\alpha \theta$ for all multi-indices $\alpha \in \mathbb{N}_0^s$ with $|\alpha| \le p$, and
$$\max_{|\alpha| \le p} \sup_{y \in \mathcal{Y}} | \partial^\alpha \theta(y) | \le L;$$
(ii) the derivatives of order $p$ are $\kappa$-Hölder continuous, i.e.,
$$\max_{|\alpha| = p} \sup_{\substack{y, y' \in \mathcal{Y} \\ y \ne y'}} \frac{| \partial^\alpha \theta(y) - \partial^\alpha \theta(y') |}{\| y - y' \|^\kappa} \le L,$$
where $\|\cdot\|$ denotes the Euclidean norm on $\mathbb{R}^s$.
The performance of an estimator $\hat{\theta} = \hat{\theta}\big( \{(U_i, Y_i)\}_{i=1}^N \big)$ is measured using the sup-norm loss
$$\| \hat{\theta} - \theta \|_\infty := \sup_{y \in \mathcal{Y}_0} | \hat{\theta}(y) - \theta(y) |.$$
The corresponding minimax sup-norm risk over $H^\beta(L)$ is defined as
$$R_N := \inf_{\hat{\theta}} \sup_{\theta \in H^\beta(L)} \mathbb{E}_\theta \| \hat{\theta} - \theta \|_\infty,$$
where $\mathbb{E}_\theta$ denotes expectation under the joint distribution induced by the conditional copula model with parameter function $\theta(\cdot)$. The aim of the subsequent analysis is to derive a lower bound for $R_N$.

3.2. Regularity Assumptions

The minimax lower bound is proved under the following assumptions.
(A1) The covariate $Y$ has a density $f_Y$ on $\mathcal{Y}$, and there exist constants $0 < c_f \le C_f < \infty$ such that $c_f \le f_Y(y) \le C_f$ for all $y \in \mathcal{Y}_0$.
(A2) The set $\mathcal{Y}_0$ is compact and contained in the interior of $\mathcal{Y}$, and there exists $r_0 > 0$ such that the closed Euclidean ball $B(y, r_0) \subset \mathcal{Y}$ for all $y \in \mathcal{Y}_0$.
(A3) There exists a compact interval $\Theta_0 \subset \mathrm{int}(\Theta)$ such that the parametric family $\{c(\cdot \mid \theta) : \theta \in \Theta\}$ is quadratic mean-differentiable (QMD) on an open set containing $\Theta_0$. That is, for each $\theta \in \Theta_0$, there exists a score function $\dot{\ell}_\theta \in L_2(c(\cdot \mid \theta))$ such that, as $t \to 0$,
$$\int \left[ \sqrt{c(u \mid \theta + t)} - \sqrt{c(u \mid \theta)} - \frac{t}{2}\, \dot{\ell}_\theta(u) \sqrt{c(u \mid \theta)} \right]^2 du = o(t^2),$$
uniformly in $\theta \in \Theta_0$. Moreover, the Fisher information $I(\theta) := \mathbb{E}_\theta \big[ \dot{\ell}_\theta(U)^2 \big]$ satisfies $0 < \underline{I} \le I(\theta) \le \overline{I} < \infty$ for all $\theta \in \Theta_0$, where $\mathbb{E}_\theta$ denotes expectation under $U \sim c(\cdot \mid \theta)$.
(A4) The Hölder class $H^\beta(L)$ is restricted to functions taking values in $\Theta_0$ on $\mathcal{Y}$.
The verification of (A3) for the bivariate Clayton and Gumbel copula families (on any compact $\Theta_0 \subset \mathrm{int}(\Theta)$) is given in Appendix B. Assumption (A3) is a standard local regularity condition for likelihood models [18]. In particular, it yields a local quadratic control of the Kullback–Leibler divergence on $\Theta_0$, which is the key technical tool that replaces any global bounded-likelihood-ratio requirement and allows for copula densities that are unbounded on $(0,1)^d$.

4. Main Result: Minimax Lower Bound for Sup-Norm Risk

We establish a minimax lower bound for the uniform estimation of $\theta(\cdot)$ over $H^\beta(L)$. The proof follows the classical packing-plus-Fano strategy and uses only the local quadratic control of Kullback–Leibler divergences implied by QMD.
Theorem 1. Under assumptions (A1)–(A4), let $\beta > 0$, and write $p = \lfloor \beta \rfloor$ and $\kappa = \beta - p \in (0, 1]$. There exists a constant $c > 0$, depending only on $s, \beta, L, c_f, C_f, \underline{I}, \overline{I}$, and $\Theta_0$, such that for all sufficiently large $N$,
$$\inf_{\hat{\theta}} \sup_{\theta \in H^\beta(L)} \mathbb{E}_\theta \| \hat{\theta} - \theta \|_\infty \;\ge\; c \left( \frac{\log N}{N} \right)^{\frac{\beta}{2\beta + s}}.$$
Remark 1. The rate $(\log N / N)^{\beta/(2\beta+s)}$ coincides with the classical minimax rate for the sup-norm estimation of a $\beta$-smooth regression function on an $s$-dimensional domain [16]. Theorem 1 shows that the conditional copula likelihood structure does not permit faster uniform estimation, even when $c(\cdot \mid \theta)$ is unbounded on $(0,1)^d$.
Proof outline. The argument combines a packing construction in $H^\beta(L)$, a local quadratic Kullback–Leibler bound implied by QMD, and a testing reduction via Fano's inequality. The complete proof (including all auxiliary lemmas) is deferred to Appendix A. □

5. Simulation Study

We illustrate the finite-sample behavior of the local likelihood estimator from Section 2 in a setting aligned with the smoothness regime of the minimax analysis. We consider the bivariate case ($d = 2$) with a single covariate ($s = 1$) and employ a local linear fit ($p = 1$), corresponding to a calibration function that is $\beta = p + 1 = 2$ Hölder smooth.
Bandwidth choice plays a central role both in practice and in theory. In estimation, the local likelihood estimator is implemented with a smoothing bandwidth $h$ selected by the leave-one-out cross-validated local likelihood (LOO-CVL), as defined in (6). In contrast, the minimax lower-bound proof (see Appendix A) introduces a theoretical localization scale through the support radius of the bump functions, tuned at the canonical rate $(\log N / N)^{1/(2\beta+s)}$ to balance the separation between alternative parameter functions and the control of the Kullback–Leibler divergence. Motivated by this construction, we define the minimax benchmark localization scale
$$h_{\mathrm{local}} = c \left( \frac{\log N}{N} \right)^{\frac{1}{2\beta + s}}, \qquad c = 1,$$
which represents the minimax-optimal spatial resolution for uniform estimation.
The reader should note that $h_{\mathrm{local}}$ is not a data-adaptive kernel smoothing bandwidth chosen for estimation but, rather, a theoretical localization scale arising from the minimax lower-bound argument. Its comparison with the data-driven LOO-CVL bandwidth $h_{\mathrm{cv}}$ is nevertheless informative, since both quantities govern the effective spatial resolution at which the local likelihood estimator can reliably detect variation in $\theta(\cdot)$. Examining whether $h_{\mathrm{cv}}$ tracks this canonical rate (up to constants) therefore provides an operational link between the minimax benchmark and a standard bandwidth selection rule used in applications.
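The benchmark quantities are immediate to compute. The following small sketch (our own) evaluates $h_{\mathrm{local}}$ with $\beta = 2$, $s = 1$, and $c = 1$, reproducing the $h_{\mathrm{local}}$ column of Table 1, together with the benchmark rate $r_N$ used later.

```python
import numpy as np

beta, s = 2.0, 1                     # smoothness and covariate dimension of Section 5
for N in [50, 100, 250, 500, 750]:
    h_local = (np.log(N) / N) ** (1.0 / (2 * beta + s))  # localization scale, c = 1
    r_N = (np.log(N) / N) ** (beta / (2 * beta + s))     # minimax benchmark rate
    print(f"N = {N:4d}:  h_local = {h_local:.4f},  r_N = {r_N:.4f}")
```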
In each replication, covariate values are generated as $Y \sim \mathrm{TN}(0, 4; [-2, 2])$, a truncated normal distribution with mean 0, variance 4, and support restricted to $[-2, 2]$. To mitigate boundary effects, estimation and evaluation are restricted to the interior region $\mathcal{Y}_0 = [-2 + \delta, 2 - \delta]$ with $\delta = 0.20$, carried out on an equally spaced grid of size $m = 60$.
Conditional on each observed covariate value $y$, we generate a single pseudo-observation $U = (U_1, U_2)$ from a conditional copula model with parameter $\theta(y)$. We consider both Clayton and Gumbel copula families, with smooth, non-constant calibration functions given by
$$\theta_{\mathrm{Clayton}}(y) = 1.5 + 0.9 \sin\!\Big( \frac{\pi}{2}(y + 2) \Big), \qquad \theta_{\mathrm{Gumbel}}(y) = 2.0 + 0.8 \sin\!\Big( \frac{\pi}{2}(y + 2) \Big).$$
These specifications induce moderate covariate-driven variation in dependence while remaining within standard parameter ranges for each family.
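For the Clayton family, conditional sampling has a closed form via the inverse conditional CDF. The following sketch of the data-generating step is our own illustration of this design (the Gumbel analogue has no closed-form inverse and would require numerical root-finding); the seed and the crude rejection step for the truncated normal are assumptions of the sketch.

```python
import numpy as np

rng = np.random.default_rng(1)

def theta_clayton(y):
    # calibration function used in the simulation study
    return 1.5 + 0.9 * np.sin(0.5 * np.pi * (y + 2.0))

def sample_clayton_pair(theta, rng):
    # inverse conditional CDF of U2 given U1 = u for the Clayton copula
    u, w = rng.uniform(size=2)
    v = (u**(-theta) * (w**(-theta / (1.0 + theta)) - 1.0) + 1.0) ** (-1.0 / theta)
    return u, v

# covariates: N(0, 4) truncated to [-2, 2] by rejection, then N = 500 kept
Y = rng.normal(0.0, 2.0, size=4000)
Y = Y[np.abs(Y) <= 2.0][:500]

# one pseudo-observation per covariate value, as in the simulation design
U = np.array([sample_clayton_pair(theta_clayton(y), rng) for y in Y])
```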
Estimation is performed using the kernel-weighted local likelihood procedure described in Section 2, with the Epanechnikov kernel. For each bandwidth choice, the local likelihood is maximized at each grid point to obtain $\hat{\theta}(y)$, using an unconstrained parameterization and an inverse link mapping to ensure admissibility. We compare the localization scale $h_{\mathrm{local}}$ with the data-driven leave-one-out cross-validated bandwidth $h_{\mathrm{cv}}$ defined in (6).
The experiment is repeated over $R = 100$ independent replications for each sample size $N \in \{50, 100, 250, 500, 750\}$. Performance is evaluated on the grid over $\mathcal{Y}_0$ using both the sup-norm loss and a discrete $L_2$ loss,
$$\| \hat{\theta} - \theta \|_\infty = \max_{y \in \mathcal{Y}_0} | \hat{\theta}(y) - \theta(y) |, \qquad L_2(\hat{\theta}, \theta) = \left( \frac{1}{m} \sum_{j=1}^{m} \big( \hat{\theta}(y_j) - \theta(y_j) \big)^2 \right)^{1/2},$$
where $\{y_j\}_{j=1}^m$ denotes the equally spaced evaluation grid on $\mathcal{Y}_0$. Thus, $L_2(\hat{\theta}, \theta)$ is the root mean squared pointwise error across the grid points, i.e., a Riemann-sum approximation to an integrated squared error on $\mathcal{Y}_0$ under the uniform measure on the grid.
To connect the numerical results to the theoretical analysis, we report the minimax benchmark rate from Theorem 1, specialized to $\beta = 2$ and $s = 1$,
$$r_N = \left( \frac{\log N}{N} \right)^{\frac{\beta}{2\beta + s}} = \left( \frac{\log N}{N} \right)^{2/5},$$
and we compare the observed sup-norm errors to $r_N$ in rate plots.
Since minimax rates are defined only up to unknown positive multiplicative constants, agreement in slopes on a log–log scale (rather than absolute vertical alignment) is the appropriate diagnostic when comparing empirical errors with $r_N$.
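A slope comparison of this kind can be read off a simple least-squares fit on the log–log scale. The sketch below (our own) uses the Clayton sup-norm means under $h_{\mathrm{cv}}$ from Table 2 as the empirical errors.

```python
import numpy as np

N = np.array([50, 100, 250, 500, 750])
r_N = (np.log(N) / N) ** 0.4                              # benchmark, beta = 2, s = 1
err = np.array([3.4876, 2.0054, 0.8960, 0.6160, 0.4767])  # Table 2, sup-norm, cv

slope_emp = np.polyfit(np.log(N), np.log(err), 1)[0]      # empirical decay rate
slope_ref = np.polyfit(np.log(N), np.log(r_N), 1)[0]      # benchmark decay rate
print(f"empirical slope {slope_emp:.2f} vs benchmark slope {slope_ref:.2f}")
```

Comparable fitted slopes, rather than vertical coincidence of the two curves, indicate agreement with the minimax scaling.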
Figure 1 shows curve recovery at $N = 750$ for both copula families. The blue curve denotes the true calibration function $\theta(y)$, while the orange and green curves represent the Monte Carlo mean of $\hat{\theta}(y)$ obtained under the theoretical localization scale $h_{\mathrm{local}}$ and the data-driven LOO-CVL bandwidth $h_{\mathrm{cv}}$, respectively. For a fixed $N$, $h_{\mathrm{local}}$ is deterministic, whereas $h_{\mathrm{cv}}$ is selected in each replication. Both estimators closely track the smooth structure of $\theta(y)$ over the interior region $\mathcal{Y}_0$.
The shaded region corresponds to a pointwise 95% confidence band constructed only for the estimator using $h_{\mathrm{cv}}$. Following [3], the asymptotic variance of the local polynomial likelihood estimator is
$$\mathrm{Var}\big( \hat{\theta}(y) \big) = \big[ N h_{\mathrm{cv}} f_Y(y)\, \sigma^2(y) \big]^{-1}\, e_0^\top S^{-1} S^{*} S^{-1} e_0,$$
where $\sigma^2(y) = \mathbb{E}\big[ -\partial_\theta^2 \ell_{\theta(y)}(U) \mid Y = y \big]$ with $\ell_\theta(U) = \log c(U \mid \theta)$, and $S, S^{*}$ denote the kernel moment matrices corresponding to the local linear fit. The resulting confidence interval is $\hat{\theta}(y) \pm z_{1-\alpha/2} \sqrt{ \widehat{\mathrm{Var}}(\hat{\theta}(y)) }$. The band is shown for diagnostic purposes and is not intended for uniform inference.
Table 1 reports the localization scale $h_{\mathrm{local}}$ and the Monte Carlo mean of the LOO-CVL bandwidth, $\bar{h}_{\mathrm{cv}}$, together with its Monte Carlo standard deviation. As implied by its definition, $h_{\mathrm{local}}$ is deterministic for each $N$ and decreases monotonically as $N$ increases. In contrast, the data-driven $\bar{h}_{\mathrm{cv}}$ is systematically larger than $h_{\mathrm{local}}$ for both copula families, reflecting the additional smoothing preferred by the finite-sample LOO-CVL objective.
Figure 2 visualizes the theoretical localization scale $h_{\mathrm{local}}$ and the Monte Carlo mean LOO-CVL bandwidth $\bar{h}_{\mathrm{cv}}$ across $N \in \{50, 100, 250, 500, 750\}$ for both copula families.
Table 2 and Table 3 report finite-sample performance based on $R = 100$ Monte Carlo replications for $N = 50, 100, 250, 500, 750$. For each sample size $N$, the columns $\|\cdot\|_{\infty,\mathrm{local}}$ and $\|\cdot\|_{\infty,\mathrm{cv}}$ denote the Monte Carlo mean of the sup-norm loss $\|\hat{\theta} - \theta\|_\infty$ evaluated on $\mathcal{Y}_0$ when the estimator is computed using the benchmark localization scale $h_{\mathrm{local}}$ and the data-driven LOO-CVL bandwidth $h_{\mathrm{cv}}$, respectively. The corresponding columns $\mathrm{SD}_{\infty,\mathrm{local}}$ and $\mathrm{SD}_{\infty,\mathrm{cv}}$ report the Monte Carlo standard deviations of these losses across replications. Analogously, $L_{2,\mathrm{local}}$ and $L_{2,\mathrm{cv}}$ denote the Monte Carlo mean of the discrete $L_2$ loss over $\mathcal{Y}_0$ under the two smoothing choices, with $\mathrm{SD}_{L_2,\mathrm{local}}$ and $\mathrm{SD}_{L_2,\mathrm{cv}}$ giving the associated Monte Carlo standard deviations.
Across both copula families, mean errors decrease monotonically with $N$ under both $h_{\mathrm{local}}$ and $h_{\mathrm{cv}}$, and the associated Monte Carlo standard deviations also decline, reflecting improved stability in larger samples.
For the Clayton copula, $h_{\mathrm{cv}}$ yields uniformly smaller mean errors than the theoretical localization scale $h_{\mathrm{local}}$ under both loss metrics at all reported $N$. The same pattern holds for the Gumbel copula: the LOO-CVL choice produces smaller mean sup-norm and $L_2$ errors at every sample size, although the magnitude of improvement is moderate. Overall, the LOO-CVL rule typically achieves modest gains in accuracy without inflating variability.
The decay of estimation error with $N$ is summarized in Figure 3, Figure 4 and Figure 5. In Figure 3 and Figure 4, the vertical axis is log-scaled to highlight rate behavior, while the horizontal axis displays the raw sample size $N$. For both copulas and both loss metrics, the Monte Carlo mean errors decrease steadily with $N$, and, as reflected in Table 2 and Table 3, the LOO-CVL choice $h_{\mathrm{cv}}$ yields uniformly smaller mean errors than $h_{\mathrm{local}}$ at all reported sample sizes.
Figure 5 compares the Monte Carlo mean sup-norm error under the LOO-CVL bandwidth to the minimax benchmark rate $r_N = (\log N / N)^{2/5}$. In both copula families, the empirical error decays at a rate comparable to $r_N$ across the considered sample sizes. Because minimax rates are defined only up to multiplicative constants, and finite-sample constants and lower-order logarithmic factors can generate visible vertical separation, agreement is assessed through the slope (i.e., the rate of decay) rather than exact vertical coincidence. The observed slopes therefore support the predicted minimax scaling, with the remaining gap reflecting finite-sample effects.
We emphasize that minimax lower bounds describe worst-case asymptotic scaling over the smoothness class and do not determine the finite-sample ordering of specific smoothing choices. Both $h_{\mathrm{local}}$ and $h_{\mathrm{cv}}$ operate at the same asymptotic rate, while the observed differences reflect finite-sample constants and bias-variance tradeoffs.

6. Discussion and Conclusions

This paper has established a minimax lower bound for the uniform estimation of covariate-dependent copula parameters over Hölder classes, showing that $R_N \ge c\, (\log N / N)^{\beta/(2\beta+s)}$ for some constant $c > 0$. The resulting rate coincides with the classical minimax sup-norm rate for $\beta$-smooth regression functions on an $s$-dimensional domain. Thus, despite the nonlinear structure of the copula likelihood, the global difficulty of uniform calibration is governed solely by smoothness and dimension.
The lower-bound argument relies only on local curvature via quadratic mean differentiability and does not require globally bounded likelihood ratios, a feature that is particularly relevant for copula densities with boundary singularities. In more complex dependence settings, high-dimensional copulas are frequently constructed using vine copulas, which represent multivariate copulas through structured collections of pair-copula components [19]. When such pair-copula parameters depend on covariates, the calibration problem becomes inherently multi-parameter and structurally constrained, substantially increasing analytical complexity. Recent developments integrating vine copulas into modern machine learning frameworks, including predictive uncertainty quantification in deep neural networks [20], further underscore the need to understand statistical complexity in structured dependence models. Extending information-theoretic minimax lower bounds to covariate-dependent vine copula models, therefore, represents a natural and technically challenging direction for future research.
Combined with the uniform upper bounds established in [4], Theorem 1 implies that local polynomial likelihood estimators are minimax rate-optimal up to multiplicative constants and the unavoidable logarithmic factor inherent to uniform estimation.
Several extensions merit further investigation. First, the present analysis assumes independent observations; extending minimax lower bounds to weakly dependent settings (e.g., mixing arrays or copula-based Markov chains) would complement recent work on dependence and mixing in copula-based time series [21]. Second, multi-parameter copulas introduce additional geometric and information-theoretic complexity in both curvature control and packing constructions. Finally, establishing adaptive minimax rates under unknown smoothness, or deriving minimax results for other functionals of θ ( · ) , such as tail dependence coefficients, remains an important direction for future research.

Supplementary Materials

The following supporting information can be downloaded at https://www.mdpi.com/article/10.3390/math14050914/s1: Python code for the simulation study, including the implementation of the kernel-weighted local likelihood estimator and the experiments for the Gumbel and Clayton copula models used in Section 5.

Author Contributions

Conceptualization, M.N.M. and C.S.A.; methodology, M.N.M.; software, M.N.M.; validation, M.N.M., O.A. and C.S.A.; formal analysis, M.N.M.; investigation, M.N.M., O.A. and C.S.A.; writing—original draft, M.N.M. and O.A.; writing—review and editing, O.A. and C.S.A.; visualization, O.A. and C.S.A.; supervision, M.N.M.; project administration, M.N.M. and C.S.A. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The original contributions presented in this study are included in the Supplementary Materials. Further inquiries can be directed to the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A. Proofs for the Minimax Lower Bound

Appendix A.1. A Localized Packing Family

Lemma A1. Fix $\beta > 0$, and write $p = \lfloor \beta \rfloor$ and $\kappa = \beta - p \in (0, 1]$. There exist constants $c_0, c_1 > 0$ and, for each integer $m \ge 2$, a finite family $\{\theta^{(0)}, \theta^{(1)}, \dots, \theta^{(M)}\} \subset H^\beta(L)$ with $M \ge c_0 m^s$, such that:
(i) $\| \theta^{(j)} - \theta^{(k)} \|_\infty \ge c_1 m^{-\beta}$ for all $j \ne k$;
(ii) $\theta^{(j)}(y) \in \Theta_0$ for all $y \in \mathcal{Y}$ and all $j$.
Proof. By assumption (A2), the compact set $\mathcal{Y}_0$ contains a closed cube $Q \subset \mathcal{Y}_0$ of side length $\ell_Q > 0$. Partition $Q$ into a regular grid of $m^s$ subcubes, each of side length $\ell_Q / m$, and let $\{y_1, \dots, y_{m^s}\}$ denote their centers.
Let $\varphi : \mathbb{R}^s \to [0,1]$ be a $C^\infty$ bump function supported on $[-1/2, 1/2]^s$, satisfying $\varphi(0) = 1$ and having bounded derivatives of all orders. Define
$$h := \frac{\ell_Q}{2m}, \qquad \varphi_k(y) := \varphi\!\left( \frac{y - y_k}{h} \right), \quad k = 1, \dots, m^s.$$
Since the grid spacing between distinct centers equals $\ell_Q / m = 2h$, the supports of the functions $\{\varphi_k\}_{k=1}^{m^s}$ are pairwise disjoint.
Fix $\theta^\star \in \mathrm{int}(\Theta_0)$, and choose $\delta_0 > 0$ such that $[\theta^\star - \delta_0, \theta^\star + \delta_0] \subset \Theta_0$. Define $\theta^{(0)}(y) := \theta^\star$ and $\theta^{(k)}(y) := \theta^\star + a h^\beta \varphi_k(y)$, $k = 1, \dots, m^s$, where $a > 0$ is a constant to be chosen below. Set $M := m^s$.
For $j \ne k$, using $\varphi_j(y_j) = 1$ and $\varphi_k(y_j) = 0$, we obtain $\| \theta^{(j)} - \theta^{(k)} \|_\infty \ge | \theta^{(j)}(y_j) - \theta^{(k)}(y_j) | = a h^\beta$. Since $h^\beta = (\ell_Q/2)^\beta m^{-\beta}$, property (i) holds with $c_1 := a (\ell_Q/2)^\beta$.
We next verify that $\theta^{(k)} \in H^\beta(L)$. Let $\alpha$ be a multi-index with $|\alpha| \le p$. For $k \ge 1$,
$$\partial^\alpha \theta^{(k)}(y) = a h^\beta h^{-|\alpha|} (\partial^\alpha \varphi)\!\left( \frac{y - y_k}{h} \right),$$
so
$$\sup_{y \in \mathcal{Y}} | \partial^\alpha \theta^{(k)}(y) | \le a h^{\beta - |\alpha|} \sup_{z \in \mathbb{R}^s} | \partial^\alpha \varphi(z) | \le a \sup_{z \in \mathbb{R}^s} | \partial^\alpha \varphi(z) |.$$
Choosing $a > 0$ sufficiently small (depending only on $L$ and $\varphi$) ensures that all derivatives of order at most $p$ are uniformly bounded by $L$.
Now, fix a multi-index $\alpha$ with $|\alpha| = p$, and take arbitrary $y, y' \in \mathcal{Y}$, $y \ne y'$. If both $y$ and $y'$ lie outside $\mathrm{supp}(\varphi_k)$, then $\partial^\alpha \theta^{(k)}(y) = \partial^\alpha \theta^{(k)}(y') = 0$, and the Hölder quotient vanishes. If both points lie in $\mathrm{supp}(\varphi_k)$, the mean value theorem yields
$$| \partial^\alpha \theta^{(k)}(y) - \partial^\alpha \theta^{(k)}(y') | \le a h^{\beta - p - 1} \sup_{z \in \mathbb{R}^s} \| \nabla \partial^\alpha \varphi(z) \| \, \| y - y' \|.$$
Since $\mathrm{diam}(\mathrm{supp}(\varphi_k)) \le \sqrt{s}\, h$, dividing by $\| y - y' \|^\kappa$ gives
$$\frac{| \partial^\alpha \theta^{(k)}(y) - \partial^\alpha \theta^{(k)}(y') |}{\| y - y' \|^\kappa} \le a (\sqrt{s})^{1-\kappa} \sup_{z \in \mathbb{R}^s} \| \nabla \partial^\alpha \varphi(z) \| =: c_2 a.$$
It remains to consider the case in which exactly one of $y, y'$ lies in $\mathrm{supp}(\varphi_k)$. Without loss of generality, assume $y \in \mathrm{supp}(\varphi_k)$ and $y' \notin \mathrm{supp}(\varphi_k)$, so that $\partial^\alpha \theta^{(k)}(y') = 0$. If $\| y - y' \| \ge h$, then
$$\frac{| \partial^\alpha \theta^{(k)}(y) - \partial^\alpha \theta^{(k)}(y') |}{\| y - y' \|^\kappa} \le \frac{a h^{\beta - p} \sup_z | \partial^\alpha \varphi(z) |}{h^\kappa} =: c_3 a.$$
If instead $\| y - y' \| < h$, let $y''$ be a point on the line segment joining $y$ and $y'$ that lies on the boundary of $\mathrm{supp}(\varphi_k)$. Then $\partial^\alpha \theta^{(k)}(y'') = 0$, and $\| y - y'' \| \le \| y - y' \| < h$. Applying the mean value theorem to $y$ and $y''$ yields
$$| \partial^\alpha \theta^{(k)}(y) - \partial^\alpha \theta^{(k)}(y') | = | \partial^\alpha \theta^{(k)}(y) - \partial^\alpha \theta^{(k)}(y'') | \le a h^{\beta - p - 1} \sup_z \| \nabla \partial^\alpha \varphi(z) \| \, \| y - y'' \|.$$
Dividing by $\| y - y' \|^\kappa$ and using $\| y - y'' \| \le \| y - y' \|$ together with $\| y - y' \|^{1-\kappa} \le h^{1-\kappa}$ gives
$$\frac{| \partial^\alpha \theta^{(k)}(y) - \partial^\alpha \theta^{(k)}(y') |}{\| y - y' \|^\kappa} \le a h^{\beta - p - 1} h^{1 - \kappa} \sup_z \| \nabla \partial^\alpha \varphi(z) \| =: c_4 a.$$
Combining the above cases shows that
$$\max_{|\alpha| = p} \sup_{y \ne y'} \frac{| \partial^\alpha \theta^{(k)}(y) - \partial^\alpha \theta^{(k)}(y') |}{\| y - y' \|^\kappa} \le c_5 a,$$
for a constant $c_5$ depending only on $\varphi$, $s$, and $\beta$. Choosing $a > 0$ sufficiently small ensures that the $\kappa$-Hölder condition holds with constant $L$, and hence $\theta^{(k)} \in H^\beta(L)$.
Finally, since $0 \le \varphi_k \le 1$, we have $| \theta^{(k)}(y) - \theta^\star | \le a h^\beta \le a$. Choosing $a \le \delta_0$ guarantees $\theta^{(k)}(y) \in \Theta_0$ for all $y \in \mathcal{Y}$ and all $k$. The lemma follows with $c_0 := 1$. □
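To visualize the construction, the following small sketch (our own, in Python, for $s = 1$) builds the disjoint bump perturbations $\theta^{(k)}$ from a standard compactly supported $C^\infty$ bump; the amplitude $a$, cube $Q = [-1, 1]$, and resolution $m$ are illustrative choices.

```python
import numpy as np

def bump(z):
    # C-infinity bump supported on [-1/2, 1/2] with bump(0) = 1
    out = np.zeros_like(z, dtype=float)
    inside = np.abs(z) < 0.5
    out[inside] = np.exp(1.0 - 1.0 / (1.0 - (2.0 * z[inside]) ** 2))
    return out

beta, a, theta_star = 2.0, 0.05, 2.0     # smoothness, amplitude, center value
ellQ, m = 2.0, 8                          # cube side length and grid resolution
h = ellQ / (2 * m)
centers = -1.0 + ellQ * (np.arange(m) + 0.5) / m   # subcube centers in Q = [-1, 1]

y = np.linspace(-1.0, 1.0, 2001)
thetas = [theta_star + a * h**beta * bump((y - yk) / h) for yk in centers]

# sup-norm separation from theta^(0) is a * h**beta (approximately, on a fine grid),
# matching Lemma A1(i) with m^{-beta} = (2h / ellQ)**beta up to the constant c_1
sep = max(np.max(np.abs(t - theta_star)) for t in thetas)
print(a * h**beta, sep)
```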

Appendix A.2. Local Quadratic Control of KL Divergence via QMD

We first establish a uniform local quadratic bound on the Kullback–Leibler divergence between nearby copula densities on $\Theta_0$, and then lift it to the joint model $(U, Y)$.
Lemma A2. There exist constants $\delta > 0$ and $C_{\mathrm{KL}} < \infty$, depending only on the constants in (A3), such that, for all $\vartheta, \eta \in \Theta_0$ with $| \vartheta - \eta | \le \delta$,
$$\mathrm{KL}\big( c(\cdot \mid \vartheta), c(\cdot \mid \eta) \big) \le C_{\mathrm{KL}} (\vartheta - \eta)^2.$$
Proof. Write $c_\theta(\cdot) := c(\cdot \mid \theta)$ and denote the squared Hellinger distance by
$$d_H^2(c_\vartheta, c_\eta) := \int \big( \sqrt{c_\vartheta(u)} - \sqrt{c_\eta(u)} \big)^2 du.$$
By quadratic mean differentiability in (A3), for each $\theta \in \Theta_0$,
$$d_H^2(c_{\theta+t}, c_\theta) = \frac{t^2}{4} I(\theta) + o(t^2), \qquad t \to 0,$$
uniformly in $\theta \in \Theta_0$. Since $I(\theta) \le \overline{I}$ on $\Theta_0$, there exist $\delta_1 > 0$ and $C_h < \infty$ such that, for all $\theta \in \Theta_0$ and $|t| \le \delta_1$, $d_H^2(c_{\theta+t}, c_\theta) \le C_h t^2$. In particular, for any $\vartheta, \eta \in \Theta_0$ with $| \vartheta - \eta | \le \delta_1$, taking $\theta = \eta$ and $t = \vartheta - \eta$ gives $d_H^2(c_\vartheta, c_\eta) \le C_h (\vartheta - \eta)^2$.
Next, we use the standard inequality that $\mathrm{KL}(p, q) \le 4\, d_H^2(p, q)$ whenever $d_H^2(p, q) \le 1/2$. By the continuity of $(\vartheta, \eta) \mapsto d_H^2(c_\vartheta, c_\eta)$ and the compactness of $\Theta_0$, we may shrink $\delta \in (0, \delta_1]$ if needed so that $d_H^2(c_\vartheta, c_\eta) \le 1/2$ whenever $\vartheta, \eta \in \Theta_0$ and $| \vartheta - \eta | \le \delta$. Therefore, for such pairs,
$$\mathrm{KL}(c_\vartheta, c_\eta) \le 4\, d_H^2(c_\vartheta, c_\eta) \le 4 C_h (\vartheta - \eta)^2,$$
which proves the claim with $C_{\mathrm{KL}} := 4 C_h$. □
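The quadratic behavior in Lemma A2 is easy to probe numerically for the Clayton family. The following Monte Carlo sketch (our own; seed, sample size, and parameter values are illustrative) estimates $\mathrm{KL}(c_{\theta_0}, c_{\theta_0 + t})$ for shrinking $t$; the ratio $\mathrm{KL}/t^2$ should stabilize near $I(\theta_0)/2$.

```python
import numpy as np

def clayton_logpdf(u, v, th):
    return (np.log1p(th) - (1 + th) * (np.log(u) + np.log(v))
            - (2 + 1 / th) * np.log(u**(-th) + v**(-th) - 1))

def sample_clayton(th, n, rng):
    # conditional-inverse sampler for the bivariate Clayton copula
    u = rng.uniform(size=n)
    w = rng.uniform(size=n)
    v = (u**(-th) * (w**(-th / (1 + th)) - 1) + 1) ** (-1 / th)
    return u, v

rng = np.random.default_rng(0)
th0, n = 2.0, 400_000
u, v = sample_clayton(th0, n, rng)
for t in [0.4, 0.2, 0.1, 0.05]:
    # KL(c_{th0}, c_{th0+t}) = E_{th0}[log c_{th0} - log c_{th0+t}]
    kl = np.mean(clayton_logpdf(u, v, th0) - clayton_logpdf(u, v, th0 + t))
    print(f"t = {t:4.2f}:  KL ~= {kl:.5f},  KL / t^2 ~= {kl / t**2:.3f}")
```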
The following lemma provides a bound on the Kullback–Leibler divergence between the joint laws of a single observation under two calibration functions.
Lemma A3. Let $\theta, \theta'$ be calibration functions taking values in $\Theta_0$ and satisfying $\| \theta - \theta' \|_\infty \le \delta$, where $\delta$ is as in Lemma A2. Then $\mathrm{KL}(P_\theta, P_{\theta'}) \le C_{\mathrm{KL}}\, \mathbb{E}\big[ (\theta(Y) - \theta'(Y))^2 \big]$, where the expectation is with respect to $Y \sim f_Y$.
Proof. The joint density of $(U, Y)$ under $\theta(\cdot)$ is $p_\theta(u, y) = c(u \mid \theta(y)) f_Y(y)$. Therefore,
$$\mathrm{KL}(P_\theta, P_{\theta'}) = \int f_Y(y) \int c(u \mid \theta(y)) \log \frac{c(u \mid \theta(y))}{c(u \mid \theta'(y))}\, du\, dy = \mathbb{E}\Big[ \mathrm{KL}\big( c(\cdot \mid \theta(Y)), c(\cdot \mid \theta'(Y)) \big) \Big].$$
Since $\| \theta - \theta' \|_\infty \le \delta$, we have $| \theta(Y) - \theta'(Y) | \le \delta$ a.s., and Lemma A2 yields
$$\mathrm{KL}\big( c(\cdot \mid \theta(Y)), c(\cdot \mid \theta'(Y)) \big) \le C_{\mathrm{KL}} \big( \theta(Y) - \theta'(Y) \big)^2 \quad \text{a.s.}$$
Taking expectations completes the proof. □
We next extend the one-observation Kullback–Leibler bound to the N-sample experiment for the packed family of alternatives constructed above.
Proposition A1. Let $\{\theta^{(0)}, \dots, \theta^{(M)}\}$ be the family constructed in Lemma A1, with $h = \ell_Q / (2m)$. Assume that $m$ is large enough so that $\| \theta^{(j)} - \theta^{(0)} \|_\infty \le \delta$ for all $j$, where $\delta$ is as in Lemma A2. Then there exists a constant $C_1 > 0$, depending only on the constants in (A1)–(A4) and on the bump function $\varphi$, such that, for all $j = 1, \dots, M$,
$$\mathrm{KL}\big( P_{\theta^{(j)}}^{\otimes N}, P_{\theta^{(0)}}^{\otimes N} \big) \le C_1 N h^{2\beta + s}.$$
Proof. By independence, $\mathrm{KL}\big( P_{\theta^{(j)}}^{\otimes N}, P_{\theta^{(0)}}^{\otimes N} \big) = N\, \mathrm{KL}( P_{\theta^{(j)}}, P_{\theta^{(0)}} )$. Since $\| \theta^{(j)} - \theta^{(0)} \|_\infty \le \delta$, Lemma A3 implies $\mathrm{KL}\big( P_{\theta^{(j)}}^{\otimes N}, P_{\theta^{(0)}}^{\otimes N} \big) \le N C_{\mathrm{KL}}\, \mathbb{E}\big[ (\theta^{(j)}(Y) - \theta^{(0)}(Y))^2 \big]$. Using (A1) and the fact that $\theta^{(j)} - \theta^{(0)}$ is supported in $Q \subset \mathcal{Y}_0$, we obtain
$$\mathbb{E}\big[ (\theta^{(j)}(Y) - \theta^{(0)}(Y))^2 \big] = \int_Q \big( \theta^{(j)}(y) - \theta^{(0)}(y) \big)^2 f_Y(y)\, dy \le C_f \int_Q \big( \theta^{(j)}(y) - \theta^{(0)}(y) \big)^2\, dy.$$
From the construction in Lemma A1, for each $j \ge 1$, $\theta^{(j)}(y) - \theta^{(0)}(y) = a h^\beta \varphi_j(y)$. Therefore,
$$\int_Q \big( \theta^{(j)}(y) - \theta^{(0)}(y) \big)^2\, dy = a^2 h^{2\beta} \int \varphi_j(y)^2\, dy.$$
A change of variables $z = (y - y_j)/h$ yields
$$\int \varphi_j(y)^2\, dy = h^s \int_{\mathbb{R}^s} \varphi(z)^2\, dz.$$
Consequently,
$$\mathbb{E}\big[ (\theta^{(j)}(Y) - \theta^{(0)}(Y))^2 \big] \le C_f a^2 h^{2\beta + s} \int_{\mathbb{R}^s} \varphi(z)^2\, dz.$$
Combining the above displays proves the result with
$$C_1 := C_{\mathrm{KL}} C_f a^2 \int_{\mathbb{R}^s} \varphi(z)^2\, dz. \qquad \square$$

Appendix A.3. Fano Argument and Completion

We conclude the proof by a standard testing reduction based on Fano’s inequality.
Lemma A4. Let $\{P_0, \dots, P_M\}$ be probability measures, and let $\hat{J}$ be any estimator of the index $J \in \{0, \dots, M\}$ based on data from $P_J$. If
$$\frac{1}{M} \sum_{j=1}^{M} \mathrm{KL}(P_j, P_0) \le \alpha \log M$$
for some $\alpha \in (0, 1/8)$, then $\inf_{\hat{J}} \sup_{0 \le j \le M} P_j( \hat{J} \ne j ) \ge c_2$, for a universal constant $c_2 > 0$.
Proof. See, for example, [16], Theorem 2.5. □
Proof of Theorem 1. Fix $m \ge 2$, and let $\{\theta^{(0)}, \dots, \theta^{(M)}\}$ be the family from Lemma A1, with $h = \ell_Q / (2m)$ and $M = m^s$. Let $P_j := P_{\theta^{(j)}}^{\otimes N}$.
Choose
$$h = \left( \frac{\log N}{N} \right)^{\frac{1}{2\beta + s}} \qquad \text{and} \qquad m = \left\lfloor \frac{\ell_Q}{2h} \right\rfloor.$$
Then $h \to 0$ and $m \to \infty$ as $N \to \infty$. Since $\| \theta^{(j)} - \theta^{(0)} \|_\infty = a h^\beta$, we may choose $a > 0$ small enough so that $a h^\beta \le \delta$ for all sufficiently large $N$, ensuring that Proposition A1 applies.
By Proposition A1,
$$\frac{1}{M} \sum_{j=1}^{M} \mathrm{KL}(P_j, P_0) \le C_1 N h^{2\beta + s} = C_1 \log N.$$
On the other hand, $M = m^s$, so
$$\log M = s \log m \ge s \log \frac{\ell_Q}{4h} = s \log \frac{\ell_Q}{4} + s \log(1/h).$$
In particular, for all sufficiently small $h$, there exists $c_3 > 0$ such that
$$\log M \ge c_3 \log(1/h).$$
With the chosen $h$, we have
$$\log(1/h) = \frac{1}{2\beta + s} \log \frac{N}{\log N} \ge c_4 \log N$$
for all sufficiently large $N$ and some constant $c_4 > 0$. Consequently, $\log M \ge c_5 \log N$ for all $N \ge N_0$, for some constants $c_5 > 0$ and $N_0 \in \mathbb{N}$.
We now enforce the Fano condition by an explicit choice of the bump amplitude. Fix $\alpha \in (0, 1/8)$. Recall from Proposition A1 that $C_1 = C_{\mathrm{KL}} C_f a^2 \int_{\mathbb{R}^s} \varphi(z)^2 dz$, so $C_1$ is proportional to $a^2$. Choose $a > 0$ sufficiently small so that $C_1 \le \alpha c_5$. Then, for all $N \ge N_0$,
$$C_1 \log N \le \alpha c_5 \log N \le \alpha \log M,$$
and hence the condition of Lemma A4 is satisfied.
Lemma A4 then yields a constant $c_2 > 0$ such that
$$\inf_{\hat{J}} \sup_{0 \le j \le M} P_j( \hat{J} \ne j ) \ge c_2.$$
We now reduce estimation to testing. Let $\hat{\theta}$ be any estimator of $\theta(\cdot)$, and define $\hat{J} := \arg\min_{0 \le j \le M} \| \hat{\theta} - \theta^{(j)} \|_\infty$, breaking ties arbitrarily. If $\hat{J} \ne J$, then, by the triangle inequality and Lemma A1(i),
$$\| \hat{\theta} - \theta^{(J)} \|_\infty \ge \frac{1}{2} \min_{k \ne J} \| \theta^{(k)} - \theta^{(J)} \|_\infty \ge \frac{1}{2} c_1 m^{-\beta} \ge \frac{c_1}{2} \left( \frac{2}{\ell_Q} \right)^{\beta} h^\beta.$$
Hence,
$$\sup_{0 \le j \le M} \mathbb{E}_j \| \hat{\theta} - \theta^{(j)} \|_\infty \ge \frac{c_1}{2} \left( \frac{2}{\ell_Q} \right)^{\beta} h^\beta \cdot \sup_{0 \le j \le M} P_j( \hat{J} \ne j ) \ge c_6 h^\beta,$$
for a constant $c_6 > 0$ independent of $N$. Since $\{\theta^{(0)}, \dots, \theta^{(M)}\} \subset H^\beta(L)$, we conclude that
$$\inf_{\hat{\theta}} \sup_{\theta \in H^\beta(L)} \mathbb{E}_\theta \| \hat{\theta} - \theta \|_\infty \ge c_6 h^\beta = c_6 \left( \frac{\log N}{N} \right)^{\frac{\beta}{2\beta + s}},$$
which completes the proof. □

Appendix B. Verification of Assumption (A3) for Clayton and Gumbel Copulas

This appendix verifies Assumption (A3) (quadratic mean differentiability and uniform Fisher information bounds on a compact parameter set) for the bivariate Clayton and Gumbel copula families. Throughout, we work with $d = 2$, which matches the simulation study and is the most common setting for conditional copula calibration. Extensions to fixed $d > 2$ follow the same structure but require heavier notation.

Appendix B.1. A Convenient Sufficient Condition for QMD

Let $\{p_\theta : \theta \in \Theta\}$ be a family of densities on a measurable space $(\mathcal{X}, \mathcal{A})$, with respect to a dominating measure $\mu$, and write $s_\theta = \sqrt{p_\theta}$.
Lemma A5 (A sufficient condition for QMD). Assume that, for every $\theta \in \Theta_0$:
(i) $s_\theta \in L_2(\mu)$, and the map $\theta \mapsto s_\theta$ is differentiable in $L_2(\mu)$; i.e., there exists $\dot{s}_\theta \in L_2(\mu)$ such that
$$\left\| \frac{s_{\theta + t} - s_\theta}{t} - \dot{s}_\theta \right\|_{L_2(\mu)} \to 0 \qquad (t \to 0);$$
(ii) $\dot{s}_\theta = \frac{1}{2} \dot{\ell}_\theta s_\theta$ $\mu$-a.e. for some measurable $\dot{\ell}_\theta$ with $\dot{\ell}_\theta \in L_2(p_\theta)$.
Then the family is quadratic mean-differentiable at $\theta$ with score $\dot{\ell}_\theta$. Moreover, the Fisher information satisfies
$$I(\theta) = \mathbb{E}_\theta[ \dot{\ell}_\theta(X)^2 ] = 4\, \| \dot{s}_\theta \|_{L_2(\mu)}^2.$$
Proof. By (i), $s_{\theta + t} = s_\theta + t \dot{s}_\theta + r_t$ with $\| r_t \|_{L_2} = o(t)$. Thus,
$$\big\| s_{\theta + t} - s_\theta - t \dot{s}_\theta \big\|_{L_2(\mu)}^2 = \| r_t \|_{L_2(\mu)}^2 = o(t^2).$$
By (ii), $t \dot{s}_\theta = \frac{t}{2} \dot{\ell}_\theta s_\theta$, which is exactly the QMD expansion in Assumption (A3). Finally,
$$\int \Big( \frac{1}{2} \dot{\ell}_\theta s_\theta \Big)^2 d\mu = \frac{1}{4} \int \dot{\ell}_\theta^2\, p_\theta\, d\mu = \frac{1}{4} I(\theta),$$
so $I(\theta) = 4\, \| \dot{s}_\theta \|_{L_2(\mu)}^2$. □
In what follows, $\mu$ is the Lebesgue measure on $(0,1)^2$, and $p_\theta = c(\cdot \mid \theta)$ is the copula density.

Appendix B.2. Clayton Copula: QMD and Bounded Fisher Information

The bivariate Clayton copula with parameter $\theta > 0$ has the distribution function
$$C_\theta(u, v) = \big( u^{-\theta} + v^{-\theta} - 1 \big)^{-1/\theta}, \qquad (u, v) \in (0,1)^2,$$
and density
$$c_\theta(u, v) = (1 + \theta)\, (uv)^{-1-\theta} \big( u^{-\theta} + v^{-\theta} - 1 \big)^{-2 - 1/\theta}. \qquad (\mathrm{A1})$$
Fix a compact interval $\Theta_0 = [\theta_-, \theta_+]$ with $0 < \theta_- < \theta_+ < \infty$.
Proposition A2. For the bivariate Clayton family (A1) and any compact $\Theta_0 \subset (0, \infty)$:
(i) The family $\{c_\theta : \theta \in \Theta_0\}$ is quadratic mean-differentiable on an open set containing $\Theta_0$;
(ii) The Fisher information $I(\theta) = \mathbb{E}_\theta[ \dot{\ell}_\theta(U, V)^2 ]$ is finite and continuous on $\Theta_0$; hence, it satisfies $0 < \underline{I} \le I(\theta) \le \overline{I} < \infty$ on $\Theta_0$.
Consequently, Assumption (A3) holds for the Clayton family on $\Theta_0$.
Proof. Step 1: explicit score and a square-integrable envelope.
Write $S_\theta(u, v) := u^{-\theta} + v^{-\theta} - 1$. From (A1),
$$\ell_\theta(u, v) = \log(1 + \theta) - (1 + \theta)(\log u + \log v) - \Big( 2 + \frac{1}{\theta} \Big) \log S_\theta(u, v).$$
Differentiating in $\theta$ (for fixed $(u, v)$):
$$\partial_\theta \ell_\theta(u, v) = \frac{1}{1 + \theta} - (\log u + \log v) + \frac{1}{\theta^2} \log S_\theta - \Big( 2 + \frac{1}{\theta} \Big) \frac{\partial_\theta S_\theta}{S_\theta} = \frac{1}{1 + \theta} - (\log u + \log v) + \frac{1}{\theta^2} \log S_\theta + \Big( 2 + \frac{1}{\theta} \Big) \frac{u^{-\theta} \log u + v^{-\theta} \log v}{S_\theta}. \qquad (\mathrm{A2})$$
For $\theta > 0$ and $(u, v) \in (0,1)^2$, we have $u^{-\theta} \ge 1$ and $v^{-\theta} \ge 1$; hence $S_\theta(u, v) \ge 1$. Therefore $1/S_\theta \le 1$, and moreover,
$$0 \le \frac{u^{-\theta}}{S_\theta} \le 1, \qquad 0 \le \frac{v^{-\theta}}{S_\theta} \le 1.$$
Also, since $S_\theta \le u^{-\theta} + v^{-\theta}$,
$$\log S_\theta \le \log\big( u^{-\theta} + v^{-\theta} \big) \le \log 2 + \log(u^{-\theta}) + \log(v^{-\theta}) = \log 2 + \theta\big( |\log u| + |\log v| \big).$$
Using these inequalities in (A2) and the compactness of $\Theta_0$, there exists a constant $C < \infty$ (depending only on $\Theta_0$) such that, for all $\theta \in \Theta_0$ and all $(u, v) \in (0,1)^2$,
$$| \partial_\theta \ell_\theta(u, v) | \le C \big( 1 + |\log u| + |\log v| \big). \qquad (\mathrm{A3})$$
Now, under any bivariate copula (in particular, under Clayton), the marginals are uniform: $U \sim \mathrm{Unif}(0,1)$ and $V \sim \mathrm{Unif}(0,1)$. Hence,
$$\mathbb{E}[ (\log U)^2 ] = \int_0^1 (\log u)^2\, du = 2 < \infty, \qquad \text{and similarly} \quad \mathbb{E}[ (\log V)^2 ] = 2.$$
Therefore, by (A3),
$$\sup_{\theta \in \Theta_0} \mathbb{E}_\theta \big[ ( \partial_\theta \ell_\theta(U, V) )^2 \big] \le C^2\, \mathbb{E}\big[ ( 1 + |\log U| + |\log V| )^2 \big] < \infty.$$
This shows that the Fisher information $I(\theta)$ is finite for all $\theta \in \Theta_0$.
• Step 2: continuity and uniform bounds for $I(\theta)$.
For each fixed $(u, v)$, $\theta \mapsto \partial_\theta \ell_\theta(u, v)$ is continuous on $\Theta_0$ (all expressions are smooth, and $S_\theta \ge 1$ avoids singularities). Moreover, by (A3), the square $( \partial_\theta \ell_\theta(u, v) )^2$ is dominated by an integrable envelope $C^2 ( 1 + |\log u| + |\log v| )^2$, which does not depend on $\theta$. Thus, by dominated convergence,
$$I(\theta) = \mathbb{E}_\theta \big[ ( \partial_\theta \ell_\theta(U, V) )^2 \big]$$
is continuous on $\Theta_0$. Since $I(\theta) > 0$ for a non-degenerate regular parametric family (the Clayton copulas are distinct for different $\theta$), continuity on the compact set $\Theta_0$ implies
$$0 < \underline{I} := \min_{\theta \in \Theta_0} I(\theta) \le \max_{\theta \in \Theta_0} I(\theta) =: \overline{I} < \infty.$$
• Step 3: QMD on $\Theta_0$.
Define $s_\theta = \sqrt{c_\theta}$. Since $c_\theta$ is a density, $s_\theta \in L_2$. Formally differentiating gives
$$\partial_\theta s_\theta(u, v) = \frac{1}{2} \big( \partial_\theta \ell_\theta(u, v) \big)\, s_\theta(u, v).$$
By Step 1, $( \partial_\theta \ell_\theta )^2$ is integrable under $c_\theta$ uniformly over $\Theta_0$; hence,
$$\int ( \partial_\theta s_\theta )^2\, du\, dv = \frac{1}{4} \mathbb{E}_\theta \big[ ( \partial_\theta \ell_\theta(U, V) )^2 \big] = \frac{1}{4} I(\theta) < \infty.$$
A standard mean-value expansion yields
$$s_{\theta + t} - s_\theta = t\, \partial_\theta s_{\theta + \xi t} \qquad \text{for some } \xi = \xi(u, v, t) \in (0, 1).$$
Using the uniform $L_2$-boundedness of $\partial_\theta s_\vartheta$ for $\vartheta$ near $\Theta_0$, one obtains that $\theta \mapsto s_\theta$ is differentiable in $L_2$, with derivative $\dot{s}_\theta = \partial_\theta s_\theta$. (Equivalently, one can apply Lemma A5 with $\dot{\ell}_\theta = \partial_\theta \ell_\theta$, justified since the envelope in (A3) gives uniform $L_2$ control and continuity.) Therefore, QMD holds on $\Theta_0$.
Combining Steps 1–3 completes the proof. □
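The closed-form score (A2) can be sanity-checked numerically. The following sketch (our own; seed, sample size, parameter value, and step size are illustrative) compares (A2) against a central finite difference of the log-density on a cloud of points in $(0,1)^2$; estimating $I(\theta) = \mathbb{E}_\theta[(\partial_\theta \ell_\theta)^2]$ would additionally require sampling from $c_\theta$, e.g., with the conditional-inverse sampler shown earlier.

```python
import numpy as np

def clayton_logpdf(u, v, th):
    return (np.log1p(th) - (1 + th) * (np.log(u) + np.log(v))
            - (2 + 1 / th) * np.log(u**(-th) + v**(-th) - 1))

def clayton_score(u, v, th):
    # analytic derivative of the log-density, display (A2)
    S = u**(-th) + v**(-th) - 1
    return (1 / (1 + th) - (np.log(u) + np.log(v))
            + np.log(S) / th**2
            + (2 + 1 / th) * (u**(-th) * np.log(u) + v**(-th) * np.log(v)) / S)

rng = np.random.default_rng(0)
u, v = rng.uniform(size=(2, 100_000))
th, eps = 1.5, 1e-5

fd = (clayton_logpdf(u, v, th + eps) - clayton_logpdf(u, v, th - eps)) / (2 * eps)
print("max |analytic - finite diff|:", np.max(np.abs(clayton_score(u, v, th) - fd)))
```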

Appendix B.3. Gumbel Copula: QMD and Bounded Fisher Information

The bivariate Gumbel copula has parameter $\theta \ge 1$ and is defined via the generator
$$C_\theta(u, v) = \exp\Big\{ - \big[ (-\log u)^\theta + (-\log v)^\theta \big]^{1/\theta} \Big\}, \qquad (u, v) \in (0,1)^2.$$
Fix a compact interval $\Theta_0 = [\theta_-, \theta_+]$ with $1 < \theta_- < \theta_+ < \infty$. (Restricting away from the boundary point $\theta = 1$ avoids technicalities associated with the independence limit.)
Let $x = -\log u$, $y = -\log v$, and $A_\theta(x, y) = x^\theta + y^\theta$. The density $c_\theta$ has a known closed form; we do not need to reproduce it fully, only confirming that $c_\theta(u, v)$ is smooth in $\theta$ for $\theta > 1$ and $(u, v) \in (0,1)^2$, and that its log-derivative admits an integrable envelope (shown below).
Proposition A3. For the bivariate Gumbel family and any compact $\Theta_0 \subset (1, \infty)$:
(i) The family $\{c_\theta : \theta \in \Theta_0\}$ is quadratic mean-differentiable on an open set containing $\Theta_0$;
(ii) The Fisher information $I(\theta) = \mathbb{E}_\theta[ \dot{\ell}_\theta(U, V)^2 ]$ is finite and continuous on $\Theta_0$; hence, it satisfies $0 < \underline{I} \le I(\theta) \le \overline{I} < \infty$ on $\Theta_0$.
Consequently, Assumption (A3) holds for the Gumbel family on $\Theta_0$.
Proof. Step 1: an integrable envelope for the score.
Set $X = -\log U$ and $Y = -\log V$. Under any copula, the marginals of $U$ and $V$ are uniform; hence, $X$ and $Y$ are each marginally $\mathrm{Exp}(1)$:
$$P(X \in dx) = e^{-x} \mathbf{1}\{x > 0\}\, dx, \qquad P(Y \in dy) = e^{-y} \mathbf{1}\{y > 0\}\, dy.$$
In particular, $X$ and $Y$ have finite moments of all orders, and also
$$\mathbb{E}[ (\log X)^2 ] < \infty, \qquad \mathbb{E}[ (\log Y)^2 ] < \infty,$$
since $\int_0^1 (\log x)^2\, dx < \infty$ and $\int_1^\infty (\log x)^2 e^{-x}\, dx < \infty$.
The log-density $\ell_\theta(u, v) = \log c_\theta(u, v)$ is a smooth combination of terms built from $X$, $Y$, $A_\theta(X, Y) = X^\theta + Y^\theta$, and $A_\theta(X, Y)^{1/\theta}$, involving only addition, multiplication, logarithms, and powers. Differentiating with respect to $\theta$ produces terms of the schematic form
$$\partial_\theta \ell_\theta(U, V) = \sum_r B_r(\theta)\, T_r(X, Y, \theta),$$
where each $T_r$ is a product of factors of the types
$$\log X, \quad \log Y, \quad \log A_\theta(X, Y), \quad \frac{X^\theta \log X}{A_\theta(X, Y)}, \quad \frac{Y^\theta \log Y}{A_\theta(X, Y)}, \quad \frac{\log A_\theta(X, Y)}{\theta^2},$$
and polynomially bounded functions of $A_\theta(X, Y)^{1/\theta}$ and $1/A_\theta(X, Y)$. On $\Theta_0 \subset (1, \infty)$, we have $A_\theta(X, Y) \ge \max( X^\theta, Y^\theta )$, so $0 \le X^\theta / A_\theta \le 1$ and $0 \le Y^\theta / A_\theta \le 1$, and also $A_\theta > 0$. Moreover, $\log A_\theta(X, Y) \le \log 2 + \theta( \log^+ X + \log^+ Y )$, and $\log A_\theta(X, Y)$ can only be large in magnitude and negative when both $X$ and $Y$ are small, which remains integrable because $\int_0^1 |\log x|^2\, dx < \infty$.
Therefore, there exists $C < \infty$, depending only on $\Theta_0$, such that
$$| \partial_\theta \ell_\theta(U, V) | \le C \big( 1 + |\log X| + |\log Y| \big) \qquad \text{for all } \theta \in \Theta_0. \qquad (\mathrm{A4})$$
Since $(\log X)^2$ and $(\log Y)^2$ have finite expectations, (A4) implies
$$\sup_{\theta \in \Theta_0} \mathbb{E}_\theta \big[ ( \partial_\theta \ell_\theta(U, V) )^2 \big] < \infty,$$
so $I(\theta)$ is finite on $\Theta_0$.
• Step 2: continuity and uniform bounds for $I(\theta)$.
For each fixed $(u, v) \in (0,1)^2$, $\theta \mapsto \partial_\theta \ell_\theta(u, v)$ is continuous in $\theta$ for $\theta > 1$. The dominating envelope in (A4) is square-integrable and does not depend on $\theta$. Hence, by dominated convergence, $I(\theta) = \mathbb{E}_\theta[ ( \partial_\theta \ell_\theta(U, V) )^2 ]$ is continuous on $\Theta_0$. The identifiability of the Gumbel family implies $I(\theta) > 0$ for $\theta > 1$, so compactness gives $0 < \underline{I} \le I(\theta) \le \overline{I} < \infty$ on $\Theta_0$.
• Step 3: QMD on $\Theta_0$.
Define $s_\theta = \sqrt{c_\theta}$. As in the Clayton case, the smoothness of $c_\theta$ in $\theta$ for $\theta > 1$ yields
$$\partial_\theta s_\theta = \frac{1}{2} ( \partial_\theta \ell_\theta )\, s_\theta.$$
Using Step 1,
$$\int ( \partial_\theta s_\theta )^2\, du\, dv = \frac{1}{4} \mathbb{E}_\theta[ ( \partial_\theta \ell_\theta(U, V) )^2 ] = \frac{1}{4} I(\theta) < \infty,$$
uniformly on $\Theta_0$. The same $L_2$ mean-value argument then shows that $\theta \mapsto s_\theta$ is differentiable in $L_2$ on $\Theta_0$; hence, QMD holds by Lemma A5 with score $\dot{\ell}_\theta = \partial_\theta \ell_\theta$.
This proves (A3) for the bivariate Gumbel family on $\Theta_0$. □
Remark A1 (Why we restrict away from the boundary points $\theta = 0$ and $\theta = 1$). The compact restriction $\Theta_0 \subset \mathrm{int}(\Theta)$ in Assumption (A3) is standard in local asymptotic theory. For Clayton, $\theta \downarrow 0$ approaches the independence copula and can create degeneracies in curvature. For Gumbel, $\theta \downarrow 1$ also approaches independence and similarly requires separate treatment. In the conditional copula setting, such boundary regimes are typically excluded by construction (e.g., via a link function and compact range restriction).

References

  1. Nelsen, R.B. An Introduction to Copulas; Springer: Berlin/Heidelberg, Germany, 2006. [Google Scholar]
  2. Sklar, M. Fonctions de repartition à n dimensions et leurs marges. Ann. L’ISUP 1959, 8, 229–231. [Google Scholar]
  3. Acar, E.F.; Craiu, R.V.; Yao, F. Dependence calibration in conditional copulas: A nonparametric approach. Biometrics 2011, 67, 445–453. [Google Scholar] [CrossRef] [PubMed]
  4. Muia, M.N. Uniform Asymptotic Theory for Local Likelihood Estimation of Covariate-Dependent Copula Parameters. arXiv 2026, arXiv:2601.01345. [Google Scholar] [CrossRef]
  5. Patton, A.J. Modelling asymmetric exchange rate dependence. Int. Econ. Rev. 2006, 47, 527–556. [Google Scholar] [CrossRef]
  6. Abegaz, F.; Gijbels, I.; Veraverbeke, N. Semiparametric estimation of conditional copulas. J. Multivar. Anal. 2012, 110, 43–73. [Google Scholar] [CrossRef]
  7. Veraverbeke, N.; Omelka, M.; Gijbels, I. Estimation of a conditional copula and association measures. Scand. J. Stat. 2011, 38, 766–780. [Google Scholar] [CrossRef]
  8. Fan, J.; Gijbels, I. Local Polynomial Modelling and Its Applications; Chapman & Hall: London, UK, 1996; Volume 66. [Google Scholar]
  9. Gijbels, I.; Veraverbeke, N.; Omelka, M. Partial and average copulas and association measures. Electron. J. Stat. 2015, 9, 2420–2474. [Google Scholar] [CrossRef]
  10. Gijbels, I.; Omelka, M.; Pešta, M.; Veraverbeke, N. Score tests for covariate effects in conditional copulas. J. Multivar. Anal. 2017, 159, 111–133. [Google Scholar] [CrossRef]
  11. Newey, W.; McFadden, D. Large sample estimation and hypothesis testing. In Handbook of Econometrics; Elsevier: Amsterdam, The Netherlands, 1994; Volume 4, pp. 2111–2245. [Google Scholar]
  12. Gijbels, I.; Omelka, M.; Veraverbeke, N. Nonparametric testing for no covariate effects in conditional copulas. Statistics 2017, 51, 475–509. [Google Scholar] [CrossRef]
  13. Gijbels, I.; Omelka, M.; Veraverbeke, N. Estimation of a copula when a covariate affects only marginal distributions. Scand. J. Stat. 2015, 42, 1109–1126. [Google Scholar] [CrossRef]
  14. Gijbels, I.; Omelka, M.; Veraverbeke, N. Multivariate and functional covariates and conditional copulas. Electron. J. Stat. 2012, 6, 1273–1306. [Google Scholar] [CrossRef]
  15. van der Vaart, A.W.; Wellner, J.A. Weak Convergence and Empirical Processes: With Applications to Statistics; Springer: Berlin/Heidelberg, Germany, 1996. [Google Scholar]
  16. Tsybakov, A.B. Introduction to Nonparametric Estimation; Springer: Berlin/Heidelberg, Germany, 2009. [Google Scholar]
  17. Yu, B. Assouad, Fano, and Le Cam. In Festschrift for Lucien Le Cam; Springer: Berlin/Heidelberg, Germany, 1997; pp. 423–435. [Google Scholar]
  18. van der Vaart, A.W. Asymptotic Statistics; Cambridge University Press: Cambridge, UK, 2000; Volume 3. [Google Scholar]
  19. Czado, C.; Nagler, T. Vine copula based modeling. Annu. Rev. Stat. Its Appl. 2022, 9, 453–477. [Google Scholar] [CrossRef]
  20. Cheng, T.; Lesmana, N.S.; Poreddy, S.R.; Chen, K. Predictive Uncertainty Quantification for Financial DNN Using Regular Vine Copula. In Proceedings of the 6th ACM International Conference on AI in Finance, Singapore, 15–18 November 2025; pp. 873–881. [Google Scholar]
  21. Muia, M. Dependence and Mixing for Perturbations of Copula-Based Markov Chains. Ph.D. Thesis, University of Mississippi, Ann Arbor, MI, USA, 2024. [Google Scholar]
Figure 1. Curve recovery for $p = 1$ ($\beta = 2$, $s = 1$) at $N = 750$: (a) Clayton and (b) Gumbel copulas. In each panel, the blue curve denotes the true calibration function $\theta(y)$ on $\mathcal{Y}_0$, while the orange and green curves give the Monte Carlo mean of $\hat{\theta}(y)$ under $h_{\mathrm{local}}$ and the LOO-CVL bandwidth $h_{\mathrm{cv}}$, respectively. The shaded region shows the pointwise 95% confidence interval for the estimator based on $h_{\mathrm{cv}}$.
Figure 2. Theoretical localization scale $h_{\mathrm{local}}$ and Monte Carlo mean LOO-CVL bandwidth $\bar{h}_{\mathrm{cv}}$ for $p = 1$ ($\beta = 2$, $s = 1$). Panel (a): Clayton; panel (b): Gumbel.
Figure 3. Monte Carlo mean sup-norm error $\|\hat{\theta} - \theta\|_\infty$ for $p = 1$ ($\beta = 2$, $s = 1$; $R = 100$). Panel (a): Clayton; panel (b): Gumbel. The blue curve corresponds to $h_{\mathrm{local}}$ and the orange curve to $h_{\mathrm{cv}}$. The vertical axis is log-scaled and $N \in \{50, 100, 250, 500, 750\}$.
Figure 4. Monte Carlo mean discrete $L_2$ error for $p = 1$ ($\beta = 2$, $s = 1$; $R = 100$). Panel (a): Clayton; panel (b): Gumbel. The blue curve corresponds to $h_{\mathrm{local}}$ and the orange curve to $h_{\mathrm{cv}}$. The vertical axis is log-scaled and $N \in \{50, 100, 250, 500, 750\}$.
Figure 5. Monte Carlo mean sup-norm error under $h_{\mathrm{cv}}$ (blue) and the benchmark rate $r_N = (\log N / N)^{2/5}$ (orange) for $p = 1$ ($\beta = 2$, $s = 1$; $R = 100$). Panel (a): Clayton; panel (b): Gumbel. The vertical axis is log-scaled and $N \in \{50, 100, 250, 500, 750\}$.
Table 1. Comparison of the theoretical localization scale $h_{\mathrm{local}}$ and the data-driven LOO-CVL bandwidth $h_{\mathrm{cv}}$ for $p = 1$ ($\beta = 2$, $s = 1$), based on $R = 100$ Monte Carlo replications. For each $N$, $h_{\mathrm{local}}$ is deterministic, whereas the reported $\bar{h}_{\mathrm{cv}}$ and its accompanying standard deviation summarize the Monte Carlo mean and variability of the LOO-CVL bandwidth across replications.

|       | Clayton              |                         |                                | Gumbel               |                         |                                |
| $N$   | $h_{\mathrm{local}}$ | $\bar{h}_{\mathrm{cv}}$ | $\mathrm{SD}(h_{\mathrm{cv}})$ | $h_{\mathrm{local}}$ | $\bar{h}_{\mathrm{cv}}$ | $\mathrm{SD}(h_{\mathrm{cv}})$ |
| 50    | 0.6007               | 0.7578                  | 0.1537                         | 0.6007               | 1.2657                  | 0.4851                         |
| 100   | 0.5403               | 0.7075                  | 0.1382                         | 0.5403               | 1.1485                  | 0.4007                         |
| 250   | 0.4665               | 0.6227                  | 0.1046                         | 0.4665               | 0.8710                  | 0.3231                         |
| 500   | 0.4158               | 0.5457                  | 0.1018                         | 0.4158               | 0.7022                  | 0.2181                         |
| 750   | 0.3883               | 0.5263                  | 0.0767                         | 0.3883               | 0.6433                  | 0.1935                         |
Table 2. Clayton copula ($p = 1$, $\beta = 2$): Monte Carlo mean and standard deviation (SD) of the sup-norm loss and the discrete $L_2$ loss over $\mathcal{Y}_0$, based on $R = 100$ replications, comparing $h_{\mathrm{local}}$ to the LOO-CVL bandwidth.

| $N$ | $\|\cdot\|_{\infty,\mathrm{local}}$ | $\mathrm{SD}_{\infty,\mathrm{local}}$ | $\|\cdot\|_{\infty,\mathrm{cv}}$ | $\mathrm{SD}_{\infty,\mathrm{cv}}$ |
| 50  | 3.6432 | 2.2898 | 3.4876 | 2.3962 |
| 100 | 2.3577 | 1.8444 | 2.0054 | 1.6909 |
| 250 | 1.0002 | 0.5554 | 0.8960 | 0.5577 |
| 500 | 0.6564 | 0.2460 | 0.6160 | 0.2777 |
| 750 | 0.5419 | 0.1957 | 0.4767 | 0.2050 |

| $N$ | $L_{2,\mathrm{local}}$ | $\mathrm{SD}_{L_2,\mathrm{local}}$ | $L_{2,\mathrm{cv}}$ | $\mathrm{SD}_{L_2,\mathrm{cv}}$ |
| 50  | 1.8661 | 1.7432 | 1.7981 | 1.9313 |
| 100 | 1.0141 | 1.2549 | 0.8771 | 1.0933 |
| 250 | 0.3932 | 0.1261 | 0.3576 | 0.1175 |
| 500 | 0.2730 | 0.0828 | 0.2559 | 0.0953 |
| 750 | 0.2259 | 0.0595 | 0.2042 | 0.0672 |
Table 3. Gumbel copula ($p = 1$, $\beta = 2$): Monte Carlo mean and standard deviation (SD) of the sup-norm loss and the discrete $L_2$ loss over $\mathcal{Y}_0$, based on $R = 100$ replications, comparing $h_{\mathrm{local}}$ to the LOO-CVL bandwidth.

| $N$ | $\|\cdot\|_{\infty,\mathrm{local}}$ | $\mathrm{SD}_{\infty,\mathrm{local}}$ | $\|\cdot\|_{\infty,\mathrm{cv}}$ | $\mathrm{SD}_{\infty,\mathrm{cv}}$ |
| 50  | 2.4712 | 0.8193 | 1.8459 | 1.0658 |
| 100 | 1.5030 | 0.8249 | 1.0787 | 0.8057 |
| 250 | 0.7559 | 0.4800 | 0.6738 | 0.5623 |
| 500 | 0.4679 | 0.1568 | 0.4434 | 0.3390 |
| 750 | 0.3722 | 0.1291 | 0.3446 | 0.1514 |

| $N$ | $L_{2,\mathrm{local}}$ | $\mathrm{SD}_{L_2,\mathrm{local}}$ | $L_{2,\mathrm{cv}}$ | $\mathrm{SD}_{L_2,\mathrm{cv}}$ |
| 50  | 0.8973 | 0.3512 | 0.7337 | 0.4290 |
| 100 | 0.5150 | 0.2349 | 0.4424 | 0.2665 |
| 250 | 0.2875 | 0.1183 | 0.2632 | 0.1438 |
| 500 | 0.1967 | 0.0531 | 0.1838 | 0.0885 |
| 750 | 0.1549 | 0.0422 | 0.1479 | 0.0510 |