Article

A Robust Covariate-Dependent Kink Threshold Regression Model for Panel Data

1
School of International Business and Trade, Fujian Business University, Fuzhou 350012, China
2
School of Economic, Xiamen University, Xiamen 361005, China
3
College of Tourism, Hunan Normal University, Changsha 410081, China
4
School of Economic, Jinan University, Guangzhou 510632, China
*
Author to whom correspondence should be addressed.
Axioms 2026, 15(5), 319; https://doi.org/10.3390/axioms15050319
Submission received: 10 March 2026 / Revised: 11 April 2026 / Accepted: 26 April 2026 / Published: 28 April 2026
(This article belongs to the Special Issue Probability, Statistics and Estimations, 2nd Edition)

Abstract

This paper introduces a rank-based panel kink threshold regression model with a covariate-dependent threshold, where the threshold is specified as a function of informative covariates. To estimate the model parameters, we propose a profile estimation procedure for both the threshold parameters and regression coefficients. Additionally, we develop a Wald test statistic to examine the constancy of the threshold and a sup-score test to detect the presence of the kink effect. Through simulation studies and an empirical analysis, we demonstrate that the proposed methods exhibit robustness against outliers and heavy-tailed errors in both parameter estimation and hypothesis testing.
MSC:
62J02; 62F03; 62F35; 62F10

1. Introduction

Threshold regression models provide a powerful analytical framework for identifying heterogeneous effects, as they partition the dataset into two or more subgroups characterized by distinct regression functions based on a continuous threshold variable. In particular, the majority of existing literature focuses on jump threshold regression [1,2], where the effects of covariates are expressed as piecewise functions of the threshold variable, with discrete jumps occurring at unknown change points. However, assuming an abrupt, discontinuous shift in the regression relationship at the threshold may be unrealistic in certain practical contexts, given that covariates often induce a more gradual transition. To address this limitation, the kink threshold regression model was developed as an extension of the regression discontinuity design (e.g., [3]). This model ensures the continuity of the regression function across all points while allowing for a “kink”, a discrete change in slope at the threshold. Since its introduction, kink threshold regression has attracted significant attention in econometrics, biostatistics, and related fields and has been adapted to diverse data types, including cross-sectional data [4,5], time series data [3,6], longitudinal data [7,8], and panel data [9,10]. Motivated by its wide applicability, in this paper, we focus on panel data, which is characterized by a large cross-section of individuals or entities observed repeatedly over a relatively small, fixed number of time periods, due to its prevalence in many research domains.
A substantial body of literature has explored kink regression under the assumption of a constant threshold parameter. For instance, Ref. [3] examined the kink effect of debt on economic growth, using long-span U.S. time series data covering the period 1791–2009. The study found that elevated debt ratios would lead to a moderate slowdown in average GDP growth rates, with the estimated constant kink threshold falling in the range of 43–44%. However, the assumption of a constant, unknown threshold is often insufficient to capture the varying kink effect. For example, when studying the nonlinear effect of public debt on economic growth, inflation exerts two opposing influences: it can erode the real value of public debt and mitigate its negative impact on growth, yet it can also act as a form of sovereign default, raising risk premiums and amplifying debt’s drag on growth. This duality implies that the kink threshold for debt is not fixed but instead depends on inflation and other covariates. To accommodate such heterogeneity, Ref. [11] extended the constant kink threshold regression to a covariate-dependent kink threshold model, where the threshold is modeled as a function of informative covariates. This new model is capable of capturing such covariate-driven shifts in the debt-growth relationship. Building upon this framework, Ref. [12] further extended the model to the panel data setting, establishing the asymptotic properties of the estimators and developing F-type test statistics to examine both the constancy of the threshold and the existence of the kink effect.
All the aforementioned methods are rooted in the least squares estimation framework, which performs reasonably well under the assumption of normality. Nevertheless, in many real-world applications, outliers and heavy-tailed errors are prevalent, and neglecting such departures from normality can severely bias parameter estimates and undermine the reliability of threshold inference. Consequently, a robust estimation procedure is desirable to ensure valid and efficient statistical inference. In the context of constant kink threshold regression, Ref. [13] proposed a rank-based estimator that exhibits robustness to outliers and heavy-tailed errors while retaining high efficiency. Despite its advantages, their estimation and inference methodology cannot be directly extended to the panel covariate-dependent kink threshold regression model, thereby leaving a critical gap in the literature.
In this paper, we implement a robust statistical inference procedure for the panel kink regression model with a covariate-dependent threshold, and our contributions are threefold. First, we develop a rank-based estimation procedure by substituting the residual sum of squares in [12] with the rank dispersion function [14]. To account for unobserved heterogeneity, we draw on the approach of [15] and adopt the within-group transformation to eliminate individual fixed effects. Given that the objective function is non-differentiable and non-convex with respect to the threshold parameters, we adapt the widely used profile estimation strategy to jointly estimate the threshold parameters and regression coefficients. We further demonstrate that the slope and threshold estimators are jointly asymptotically normal with a root-n convergence rate, owing to the continuity of the regression function with respect to the threshold parameter. Second, we design a formal testing procedure to examine whether the kink threshold is constant or covariate-dependent. Leveraging the asymptotic properties of the proposed estimators, we construct a standard Wald statistic to test the null hypothesis of threshold constancy. Third, we develop a testing procedure for the existence of the kink threshold effect, which is based on a weighted CUSUM-type statistic of subgradients. The asymptotic properties of the proposed test statistic are rigorously established under both the null and alternative hypotheses, and a simulation-based implementation procedure is outlined to facilitate the practical application of the test.
The remainder of this paper is organized as follows. Section 2 introduces the covariate-dependent panel kink threshold regression model, elaborates on the rank-based estimation for model parameters, and proposes test statistics for both threshold constancy and the presence of the threshold effect. Section 3 presents Monte Carlo simulation results aimed at evaluating the finite-sample performance of the proposed inference procedures. Section 4 provides an empirical application using a panel wage dataset to illustrate the practical utility of the proposed methodologies. Finally, Section 5 concludes the paper and discusses potential avenues for future research. All technical proofs are relegated to Appendix A.

2. Methodology

2.1. Covariate-Dependent Panel Kink Threshold Regression Model

Consider the following panel data regression model featuring a kink (i.e., a slope change), where the threshold point is not constant but instead varies with other explanatory variables:
$$y_{it} = \beta_0 x_{it} + \beta_1 (x_{it} - \gamma_{it})_+ + \beta_2^T z_{it} + \mu_i + \varepsilon_{it}, \tag{1}$$
for units $i = 1, 2, \ldots, n$ and time periods $t = 1, 2, \ldots, T$, where $y_{it}$ is the dependent variable, $x_{it}$ is the primary regressor of interest, and $\varepsilon_{it}$ is the idiosyncratic error term. The vector $z_{it}$ contains a set of $l$ control variables and incorporates the covariates $q_{it}$ (defined in Equation (2)), which determine the threshold. The term $\mu_i$ captures unobserved, time-invariant individual effects that may be correlated with other regressors. The expression $(x_{it} - \gamma_{it})_+ = \max(x_{it} - \gamma_{it}, 0) = (x_{it} - \gamma_{it}) I(x_{it} > \gamma_{it})$ denotes the positive part of $x_{it} - \gamma_{it}$. This formulation implies that the slope of $x_{it}$ is $\beta_0$ when $x_{it}$ is below the threshold $\gamma_{it}$, and changes to $\beta_0 + \beta_1$ when $x_{it}$ exceeds it, producing a kink at the point $x_{it} = \gamma_{it}$. Importantly, the threshold $\gamma_{it}$ is modeled as a linear function of observable covariates $q_{it} = (q_{1,it}, \ldots, q_{k,it})^T$, thereby accommodating heterogeneity in the threshold across individuals and/or over time:
$$\gamma_{it} = \gamma_0 + \gamma_1^T q_{it}, \tag{2}$$
where $\gamma_0$ is an intercept and $\gamma_1$ is a $k \times 1$ vector of coefficients. Note that $q_{it}$ cannot include $x_{it}$ itself, as doing so would induce perfect multicollinearity.
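As a small illustration of Equations (1) and (2), the following sketch evaluates the regression function without the fixed effect and the error term. The function name `kink_mean` and the numerical values are hypothetical and chosen only to show the slope change at the covariate-dependent threshold.

```python
import numpy as np

def kink_mean(x, z, q, beta0, beta1, beta2, gamma0, gamma1):
    """Systematic part of model (1): beta0*x + beta1*(x - gamma_it)_+ + beta2'z."""
    x = np.asarray(x, dtype=float)
    z = np.asarray(z, dtype=float)
    q = np.asarray(q, dtype=float)
    # Covariate-dependent threshold, Equation (2): gamma_it = gamma0 + gamma1' q_it
    gamma_it = gamma0 + q @ np.asarray(gamma1, dtype=float)
    # Positive part (x - gamma_it)_+, which is zero below the threshold
    kink = np.maximum(x - gamma_it, 0.0)
    return beta0 * x + beta1 * kink + z @ np.asarray(beta2, dtype=float)

# Hypothetical inputs: three observations, one threshold covariate (k = 1, l = 1)
x = np.array([-1.0, 0.0, 2.0])
q = np.array([[0.0], [0.0], [2.0]])
z = q.copy()
fitted = kink_mean(x, z, q, beta0=1.0, beta1=-1.0, beta2=[2.0],
                   gamma0=0.0, gamma1=[0.5])
```

For the third observation the threshold is 0 + 0.5 * 2 = 1, so the regressor value 2 lies above it and the slope beta0 + beta1 applies beyond the kink.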
The covariate-dependent kink threshold regression model can be viewed as a generalization of the kink regression models proposed by [3,9], as it allows the kink threshold to vary across observations. This model has also been studied by [12,16] using the least squares (LS) estimator, which relies on the assumptions of zero-mean and finite-variance errors. However, the LS estimator is well known to be highly sensitive to outliers, and its validity is compromised when the error follows a heavy-tailed distribution (e.g., the Cauchy distribution), as such distributions violate the LS assumptions and lead to unreliable estimates. This fundamental limitation motivates the need for a more robust alternative, prompting us to explore a rank-based regression framework.

2.2. Rank-Based Estimator

To enhance the robustness of the covariate-dependent kink threshold regression model (1), we adopt a rank-based estimation approach grounded in Jaeckel’s dispersion function, originally developed by [14,17].
For clarity, we first rewrite the model in a more compact form. Denote $\beta = (\beta_0, \beta_1, \beta_2^T)^T$, $\gamma = (\gamma_0, \gamma_1^T)^T$, and $x_{it}(\gamma) = (x_{it}, (x_{it} - \gamma_{it})_+, z_{it}^T)^T$. Model (1) can then be expressed as
$$y_{it} = \beta^T x_{it}(\gamma) + \mu_i + \varepsilon_{it}. \tag{3}$$
Since the individual effects $\mu_i$ are not of main interest, we eliminate them via within-individual centering, yielding
$$\ddot{y}_{it} = \beta^T \ddot{x}_{it}(\gamma) + \ddot{\varepsilon}_{it}, \tag{4}$$
where, for example, $\ddot{x}_{it}(\gamma) = x_{it}(\gamma) - T^{-1} \sum_{t=1}^{T} x_{it}(\gamma)$. To estimate the unknown parameter vector $\theta \equiv (\beta^T, \gamma^T)^T$, one can minimize the objective function
$$Q_n(\theta) = \frac{1}{n} \sum_{i=1}^{n} \sum_{t=1}^{T} \phi\left( \frac{R(\ddot{\varepsilon}_{it})}{nT + 1} \right) \ddot{\varepsilon}_{it}, \tag{5}$$
where $R(\ddot{\varepsilon}_{it})$ is the rank of $\ddot{\varepsilon}_{it} = \ddot{y}_{it} - \beta^T \ddot{x}_{it}(\gamma)$ among $\{\ddot{\varepsilon}_{11}, \ldots, \ddot{\varepsilon}_{nT}\}$, and $\phi(\cdot)$ is a non-decreasing, square-integrable score function defined on $(0, 1)$, standardized such that $\int_0^1 \phi(u)\,du = 0$ and $\int_0^1 \phi(u)^2\,du = 1$. The choice of score function depends on the shape of the error distribution [18]. Commonly used options include the Wilcoxon score, $\phi(t) = \sqrt{12}\,(t - 0.5)$, and the sign score, $\phi(t) = \mathrm{sgn}(t - 0.5)$. The Wilcoxon score is particularly effective for symmetric, moderately heavy-tailed distributions, offering both robustness and relatively high efficiency. Therefore, we adopt the Wilcoxon score function in this study.
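The within-individual centering and the Wilcoxon version of the dispersion function can be sketched in a few lines. This is a minimal sketch, assuming a balanced panel stored as arrays of shape (n, T); the function names are hypothetical, and ties among residuals are broken by position rather than averaged.

```python
import numpy as np

def within_center(a):
    """Subtract each unit's time average; `a` has shape (n, T) or (n, T, p)."""
    a = np.asarray(a, dtype=float)
    return a - a.mean(axis=1, keepdims=True)

def wilcoxon_dispersion(resid, n_units):
    """Rank dispersion with Wilcoxon scores sqrt(12) * (R / (nT + 1) - 0.5)."""
    e = np.ravel(np.asarray(resid, dtype=float))
    # Ranks 1..nT of the residuals (double argsort; ties broken by order)
    ranks = np.argsort(np.argsort(e)) + 1
    scores = np.sqrt(12.0) * (ranks / (e.size + 1) - 0.5)
    return float(np.sum(scores * e) / n_units)
```

Because the Wilcoxon scores sum to zero, the dispersion is unchanged when a constant is added to all residuals, which is one reason no intercept is identified by this criterion.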
However, estimating the parameter vector $\theta$ is challenging because the objective function $Q_n(\theta)$ is convex in $\beta$ but non-convex in $\gamma$. Therefore, its minimizer cannot be obtained by directly minimizing (5). To address this issue, we adopt the profile estimation strategy, a widely used approach for threshold-based models [2,3,6,8,12]. Specifically, we can express the objective function (5) in the form
$$Q_n(\beta, \gamma) = \frac{1}{n} \sum_{i=1}^{n} \sum_{t=1}^{T} \sqrt{12} \left( \frac{R(\ddot{y}_{it} - \beta(\gamma)^T \ddot{x}_{it}(\gamma))}{nT + 1} - 0.5 \right) \left( \ddot{y}_{it} - \beta(\gamma)^T \ddot{x}_{it}(\gamma) \right). \tag{6}$$
The minimization proceeds in two stages:
(1) For each $\gamma \in \Gamma$, where $\Gamma$ is a compact set of feasible $\gamma$ values, we compute the profile estimate of $\beta(\gamma)$ as $\hat{\beta}(\gamma) = \arg\min_{\beta} Q_n(\beta(\gamma), \gamma)$.
(2) We then estimate $\gamma$ by $\hat{\gamma} = \arg\min_{\gamma \in \Gamma} Q_n(\hat{\beta}(\gamma), \gamma)$. The final profiled estimator for $\theta$ is thus defined as $\hat{\theta} = (\hat{\beta}(\hat{\gamma})^T, \hat{\gamma}^T)^T$.
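The two stages can be sketched as a grid search wrapped around an inner minimization over the slope vector. This is a minimal sketch under stated assumptions, not the authors' implementation: it takes a balanced panel with one control variable and one threshold covariate (l = k = 1), and it swaps in SciPy's Nelder-Mead routine, started from the least squares fit, as a stand-in for a dedicated rank-regression solver. All function names are hypothetical.

```python
import itertools
import numpy as np
from scipy.optimize import minimize

def center(a):
    """Within-individual centering for arrays of shape (n, T)."""
    return a - a.mean(axis=1, keepdims=True)

def dispersion(e, n):
    """Wilcoxon rank dispersion of the residual vector e."""
    r = np.argsort(np.argsort(e)) + 1
    return np.sum(np.sqrt(12.0) * (r / (e.size + 1) - 0.5) * e) / n

def design(x, z, q, gamma):
    """Centered regressors (x, (x - gamma_it)_+, z) stacked as an (nT, 3) matrix."""
    g = gamma[0] + gamma[1] * q
    cols = [center(x), center(np.maximum(x - g, 0.0)), center(z)]
    return np.stack([c.ravel() for c in cols], axis=1)

def profile_fit(y, x, z, q, grid0, grid1):
    """Stage 1: minimize over beta for each gamma. Stage 2: pick the best gamma."""
    n = y.shape[0]
    yc = center(y).ravel()
    best = None
    for gamma in itertools.product(grid0, grid1):
        X = design(x, z, q, gamma)
        start = np.linalg.lstsq(X, yc, rcond=None)[0]   # least squares start
        res = minimize(lambda b: dispersion(yc - X @ b, n), start,
                       method="Nelder-Mead")
        if best is None or res.fun < best[0]:
            best = (res.fun, res.x, np.array(gamma))
    return best   # (objective value, beta_hat, gamma_hat)
```

The cost grows with the product of the grid sizes, which is why the grid resolution and the dimension of the threshold covariates matter in practice.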

2.3. Computational Details

In this subsection, we provide additional details on the numerical implementation of the proposed profile estimation procedure, which are essential for replication and practical application.
Grid construction and search region. In Step (1), we need to specify $\Gamma = \Gamma_0 \times \Gamma_1$, where $\Gamma_0$ and $\Gamma_1 = \Gamma_{11} \times \Gamma_{12} \times \cdots \times \Gamma_{1k}$ are the parameter spaces for $\gamma_0$ and $\gamma_1$, which are assumed to be compact. In applications, following [12], we define $\Gamma_0$ as $[x_{(0.15N)}, x_{(0.85N)}]$, with $x_{(\eta)}$ being the $\eta$th order statistic of $x_{it}$, and set $\Gamma_{1j}$ as $[-r_{\max}, r_{\max}]$, in which $r_{\max} = \max\{|x_{(0.15N)}|, |x_{(0.85N)}|\}$, for $j = 1, \ldots, k$. For each component of $\gamma$, we construct an evenly spaced grid over its admissible range. The resolution of the grid is chosen to balance computational feasibility and estimation accuracy. In our implementation, we use a moderately fine grid (e.g., 100 grid points per dimension), which we find sufficient to achieve stable results. As suggested by [15], selecting a threshold parameter $\gamma$ that assigns an excessively small number of observations to either regime can be suboptimal. To mitigate this issue, we recommend constraining the grid search to satisfy a minimum regime size requirement, ensuring that each regime contains at least a specified percentage (e.g., 10% or 15%) of the total observations. Additionally, a robustness analysis using different percentage thresholds is advised to assess the stability and consistency of the estimation results.
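The grid construction and the minimum regime size check described above can be sketched as follows. The sketch assumes that N is the total number of observations and that the order-statistic index is obtained by rounding; both are assumptions, as are the function names.

```python
import numpy as np

def threshold_grid(x, k, n_points=100, lo=0.15, hi=0.85):
    """Evenly spaced grids for gamma_0 and for each of the k components of gamma_1."""
    xs = np.sort(np.ravel(np.asarray(x, dtype=float)))
    N = xs.size
    # Order statistics x_(0.15N) and x_(0.85N); the index rounding is an assumption
    x_lo = xs[max(int(round(lo * N)), 1) - 1]
    x_hi = xs[max(int(round(hi * N)), 1) - 1]
    r_max = max(abs(x_lo), abs(x_hi))
    grid0 = np.linspace(x_lo, x_hi, n_points)
    grids1 = [np.linspace(-r_max, r_max, n_points) for _ in range(k)]
    return grid0, grids1

def regime_share_ok(x, gamma_it, min_share=0.10):
    """True if both regimes hold at least `min_share` of the observations."""
    share_above = np.mean(np.asarray(x) > gamma_it)
    return bool(min_share <= share_above <= 1.0 - min_share)
```

Candidate values of the threshold parameters that fail `regime_share_ok` would simply be skipped during the grid search.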
Profile estimation procedure. Given a candidate value of γ , the slope parameter β is obtained by minimizing the rank-based objective function Q n ( β , γ ) , which is convex in β . This estimation procedure can be readily implemented via conventional rank-based regression methods, and all computations can be performed using widely available software packages, including the R package Rfit.
Initialization and local minima. Although the objective function is generally non-convex in γ , the grid search approach ensures a global solution within the discretized parameter space. To further guard against potential local irregularities, we conduct sensitivity checks using alternative grid initializations and confirm that the resulting estimates are stable.
Computational complexity. The computational cost of the proposed method is primarily driven by the grid search over γ and the repeated evaluation of the rank-based objective function. While the procedure is computationally efficient for low-dimensional threshold parameters, the cost increases with the dimension of q i t . Our implementation demonstrates that the method remains feasible for moderate sample sizes and low-to-moderate dimensional covariates. When the dimension of q i t increases, the iterative estimation scheme proposed by [16] provides an effective solution, particularly when initialized with a suitable starting value.
Overall, these implementation details ensure that the proposed estimation procedure is both reproducible and practically feasible while maintaining robustness to outliers and heavy-tailed errors.

2.4. Asymptotic Properties

Adopting the asymptotic framework of [12], our theoretical analysis is conducted under the setting where $n \to \infty$ and $T$ is fixed. To derive the asymptotic distribution of the proposed rank-based estimator $\hat{\theta}$, we introduce the following notation. Let $F(\cdot)$ and $f(\cdot)$ denote the cumulative distribution function and probability density function of the error term $\ddot{\varepsilon}_{it}$, respectively. Define the scale parameter $c_\phi = \left\{ \int \phi'(F(u)) f(u)\, dF(u) \right\}^{-1}$. Next, let $1^{+}_{it}(\gamma) = 1(x_{it} > \gamma_{it})$, $\ddot{1}^{+}_{it}(\gamma) = 1^{+}_{it}(\gamma) - \frac{1}{T} \sum_{t=1}^{T} 1^{+}_{it}(\gamma)$, and $\ddot{q}^{+}_{it}(\gamma) = q_{it} 1^{+}_{it}(\gamma) - \frac{1}{T} \sum_{t=1}^{T} q_{it} 1^{+}_{it}(\gamma)$. We define the centered error term $\ddot{\varepsilon}_{it}(\theta) = \ddot{y}_{it} - \beta^T \ddot{x}_{it}(\gamma)$, and the gradient vector $h_{it}(\theta) = -\partial \ddot{\varepsilon}_{it}(\theta) / \partial \theta = \left( \ddot{x}_{it}(\gamma)^T, -\beta_1 \ddot{1}^{+}_{it}(\gamma), -\beta_1 \ddot{q}^{+}_{it}(\gamma)^T \right)^T$. Furthermore, define the matrices
$$G(\theta) = \frac{1}{c_\phi} E\left[ \sum_{t=1}^{T} h_{it}(\theta) h_{it}(\theta)^T \right], \qquad \Sigma(\theta) = \lim_{n \to \infty} \mathrm{Var}\left( \frac{1}{\sqrt{n}} \sum_{i=1}^{n} \sum_{t=1}^{T} \sqrt{12} \left( \frac{R(\ddot{\varepsilon}_{it}(\theta))}{nT + 1} - 0.5 \right) h_{it}(\theta) \right).$$
For brevity, we denote $\ddot{\varepsilon}_{it} = \ddot{\varepsilon}_{it}(\theta)$, $h_{it} = h_{it}(\theta)$, $G = G(\theta)$, and $\Sigma = \Sigma(\theta)$, where $\theta$ is the true parameter value. To establish the asymptotic distribution of $\hat{\theta}$, we impose the following regularity conditions.
(A1) (i) For each $t$, $v_{it} \equiv (y_{it}, x_{it}, z_{it}^T, q_{it}^T)^T$ are independently and identically distributed (i.i.d.) across $i$; (ii) For some $r > 0$, $E|y_{it}|^{4+r} < \infty$, $E|x_{it}|^{4+r} < \infty$, $E|z_{it}|^{4+r} < \infty$, and $E|q_{it}|^{4+r} < \infty$; (iii) $E[\varepsilon_{it} \mid (x_{is}, z_{is}^T, q_{is}^T, u_i : 1 \le s \le T)] = 0$.
(A2) The variable $x_{it}$ has a conditional probability density function given $q_{it} = q$, denoted by $f_{q,t}(x \mid q)$, satisfying $\max_{1 \le t \le T} f_{q,t}(x \mid q) \le \bar{f}_q < \infty$.
(A3) The random error $\ddot{\varepsilon}_{it}$ has a continuous density function $f(\cdot)$ with a bounded first derivative and finite Fisher information.
(A4) The true parameter $\theta = \arg\min_{(\beta, \gamma) \in B \times \Gamma} E[Q_n(\theta)]$ exists and is unique, where $\Theta = B \times \Gamma$ is a compact subset of $\mathbb{R}^{k+l+3}$ containing $\theta$.
(A5) $\beta_1 \neq 0$.
(A6) $G(\theta)$ and $\Sigma(\theta)$ are positive definite in a neighborhood of $\theta$.
Condition (A1) imposes finite moment conditions on the response and the explanatory variables and assumes strict exogeneity of the regressors and covariates influencing the threshold. Conditions (A2)–(A5) are standard in the threshold regression literature. Condition (A2) ensures that the threshold variable $x_{it}$ has a bounded and continuous density given $q_{it}$. Condition (A3) is a common assumption in rank estimation, guaranteeing smoothness and identifiability of the score function. Condition (A4) ensures that the population objective function attains a unique minimum and that the parameter space $\Theta$ is compact. Condition (A5) serves as the identification condition required for consistent estimation of $\theta$. Condition (A6) guarantees that the Hessian and variance matrices are invertible near $\theta$, enabling derivation of the asymptotic distribution.
Theorem 1.
Under regularity conditions (A1)–(A6), as $n \to \infty$, we have
(i) $\hat{\theta} \xrightarrow{p} \theta$.
(ii) $\sqrt{n}(\hat{\theta} - \theta)$ is asymptotically normally distributed with mean zero and covariance matrix $G^{-1} \Sigma G^{-1}$, i.e., $\sqrt{n}(\hat{\theta} - \theta) \xrightarrow{D} N(0, G^{-1} \Sigma G^{-1})$.
It is important to emphasize that, in our model framework, the regression coefficient and threshold estimators $(\hat{\beta}^T, \hat{\gamma}^T)^T$ are jointly asymptotically normal with a convergence rate of $\sqrt{n}$. This property sets our model apart from conventional threshold regression models featuring a discontinuous jump, such as those considered in [19,20,21,22], where the regression coefficient estimator $\hat{\beta}$ remains $\sqrt{n}$-consistent, but the threshold estimator $\hat{\gamma}$ is $n$-consistent and follows a non-standard asymptotic distribution. In contrast, the $\sqrt{n}$ convergence rate of $\hat{\gamma}$ in our case arises from the continuity of $Q_n(\theta)$ in $\gamma$. This result is consistent with the behavior observed in conventional kink threshold regression models, as studied, for example, in [3,5,8,10].

2.5. Testing for the Threshold Constancy

The result in Theorem 1 enables valid statistical inference by providing consistent estimators of the asymptotic covariance matrix. For instance, we can test whether the kink threshold effect is constant across covariates. To this end, we consider the following hypotheses:
$$H_0^c: \gamma_1 = 0 \quad \text{vs.} \quad H_1^c: \gamma_1 \neq 0.$$
A natural test for distinguishing between a constant threshold and the covariate-dependent threshold model (1) is based on the Wald statistic:
$$W_n = n\, \hat{\theta}^T R \left( R^T \hat{G}^{-1} \hat{\Sigma} \hat{G}^{-1} R \right)^{-1} R^T \hat{\theta},$$
where $R = \left( 0_{k \times (l+3)}, I_k \right)^T$ is the selection matrix that picks out $\gamma_1$ (so that $R^T \hat{\theta} = \hat{\gamma}_1$), and $\hat{G} = \hat{G}(\hat{\theta})$ and $\hat{\Sigma} = \hat{\Sigma}(\hat{\theta})$ are consistent estimators of $G$ and $\Sigma$, respectively. In practice, these matrices can be approximated by:
$$\hat{G} = \frac{1}{\hat{c}_\phi} \frac{1}{n} \sum_{i=1}^{n} \sum_{t=1}^{T} h_{it}(\hat{\theta}) h_{it}(\hat{\theta})^T, \quad \text{and} \quad \hat{\Sigma} = \frac{1}{n} \sum_{i=1}^{n} \sum_{t=1}^{T} 12 \left( \frac{R(\ddot{\varepsilon}_{it}(\hat{\theta}))}{nT + 1} - 0.5 \right)^2 h_{it}(\hat{\theta}) h_{it}(\hat{\theta})^T. \tag{9}$$
Here, $c_\phi$ is a scale parameter that depends on the density function $f$ and the score function $\phi$. A consistent estimator of $c_\phi$ is required for valid inference, and we adopt the estimator proposed by [23]. The asymptotic distribution of $W_n$ under the null hypothesis is given below.
Theorem 2.
Suppose that the conditions in Theorem 1 hold. Then, under $H_0^c$,
$$W_n \xrightarrow{D} \chi^2_k.$$
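Given the stacked gradient vectors and the residuals at the estimate, the sandwich covariance and the Wald statistic reduce to a few matrix operations. The sketch below assumes that the rows of `H` hold the estimated gradient vectors for all unit-period pairs, that the estimate of the scale parameter is supplied by the user (its estimator is not reproduced here), and that the last k entries of the parameter vector are the threshold slope coefficients. The function names are hypothetical.

```python
import numpy as np

def sandwich_cov(H, resid, n, c_phi_hat):
    """Estimate G^{-1} Sigma G^{-1} from gradients H (nT x d) and residuals."""
    e = np.ravel(np.asarray(resid, dtype=float))
    r = np.argsort(np.argsort(e)) + 1
    s2 = 12.0 * (r / (e.size + 1) - 0.5) ** 2       # squared Wilcoxon scores
    G = H.T @ H / (n * c_phi_hat)
    Sigma = (H * s2[:, None]).T @ H / n
    G_inv = np.linalg.inv(G)
    return G_inv @ Sigma @ G_inv

def wald_constancy(theta_hat, avar, n, k):
    """Wald statistic for H0: gamma_1 = 0, with gamma_1 the last k entries of theta."""
    theta_hat = np.asarray(theta_hat, dtype=float)
    d = theta_hat.size
    R = np.vstack([np.zeros((d - k, k)), np.eye(k)])   # R' theta = gamma_1
    g1 = R.T @ theta_hat
    V = R.T @ avar @ R
    return float(n * g1 @ np.linalg.solve(V, g1))
```

The returned statistic is compared with a chi-square critical value with k degrees of freedom, for example 3.841 at the 5% level when k = 1.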

2.6. Testing for the Kink Threshold Effect

Note that the proposed estimation procedure relies on the presence of a threshold effect (i.e., $\beta_1 \neq 0$). Thus, another important question is whether such a threshold effect exists in the regression model (1). To address this, we consider the following null and alternative hypotheses:
$$H_0^l: \beta_1 = 0 \ \text{for any } \gamma \in \Gamma \quad \text{vs.} \quad H_1^l: \beta_1 \neq 0 \ \text{for some } \gamma \in \Gamma.$$
Under $H_0^l$, the model reduces to a standard linear specification, and the threshold parameter $\gamma$ becomes unidentifiable. Existing tests for this linearity hypothesis typically rely on either a Wald-type statistic [12] or a likelihood ratio-type statistic [24]. However, both approaches require fitting the full alternative kink threshold model under $H_1^l$, which can be computationally intensive, especially when the dimension of $\gamma$ is large. To avoid this burden, we propose a test based on the Lagrange multiplier principle, utilizing the score process to construct a more efficient testing procedure.
The test is constructed by sequentially evaluating the subgradients of the objective function under $H_0^l$ over a subsample, in a manner analogous to the CUSUM statistic. We define the score-based test statistic as
$$R_n(\gamma) = \frac{1}{\sqrt{n}} \sum_{i=1}^{n} \sum_{t=1}^{T} \sqrt{12} \left( \frac{R(\ddot{y}_{it} - \hat{\xi}^T \ddot{w}_{it})}{nT + 1} - 0.5 \right) \left[ \ddot{x}_{it}(\gamma) - c_\phi \hat{S}_{2n}(\gamma)^T \hat{S}_{1n}^{-1} \ddot{w}_{it} \right],$$
where $\ddot{x}_{it}(\gamma) = (x_{it} - \gamma_{it})_+ - T^{-1} \sum_{t=1}^{T} (x_{it} - \gamma_{it})_+$, $\ddot{w}_{it} = w_{it} - T^{-1} \sum_{t=1}^{T} w_{it}$ with $w_{it} = (x_{it}, z_{it}^T)^T$,
$$\hat{S}_{2n}(\gamma) = n^{-1} \sum_{i=1}^{n} \sum_{t=1}^{T} \sqrt{12}\, \hat{f}(\hat{\ddot{\varepsilon}}_{it}^{l})\, \ddot{w}_{it} \ddot{x}_{it}(\gamma) \quad \text{and} \quad \hat{S}_{1n} = n^{-1} \sum_{i=1}^{n} \sum_{t=1}^{T} \sqrt{12}\, \hat{f}(\hat{\ddot{\varepsilon}}_{it}^{l})\, \ddot{w}_{it} \ddot{w}_{it}^T.$$
Here, $\hat{\xi} \equiv (\hat{\beta}_0, \hat{\beta}_2^T)^T$ denotes the estimator of $\xi \equiv (\beta_0, \beta_2^T)^T$ under the null hypothesis $H_0^l$, obtained via
$$\hat{\xi} = \arg\min_{\xi} \sum_{i=1}^{n} \sum_{t=1}^{T} \sqrt{12} \left( \frac{R(\ddot{y}_{it} - \xi^T \ddot{w}_{it})}{nT + 1} - 0.5 \right) \left( \ddot{y}_{it} - \xi^T \ddot{w}_{it} \right), \tag{11}$$
where $R(\ddot{y}_{it} - \xi^T \ddot{w}_{it})$ is the rank of the residual $\ddot{y}_{it} - \xi^T \ddot{w}_{it}$ among all residuals $\{\ddot{y}_{11} - \xi^T \ddot{w}_{11}, \ldots, \ddot{y}_{nT} - \xi^T \ddot{w}_{nT}\}$. Correspondingly, $\hat{\ddot{\varepsilon}}_{it}^{l} = \ddot{y}_{it} - \hat{\xi}^T \ddot{w}_{it}$ are the estimated residuals under $H_0^l$, and $\hat{f}(\cdot)$ is a kernel-based estimator of the error density $f(\cdot)$. Importantly, our test statistic $R_n(\gamma)$ only requires estimation of the null model, as specified in (11), making it substantially more computationally efficient than alternative approaches that require fitting the full alternative model under $H_1^l$.
Since $\gamma$ is not identified under $H_0^l$, we follow the union-intersection principle [25] and take the supremum of $|R_n(\gamma)|$ over the compact set $\Gamma$. Therefore, we propose the following test statistic
$$L_n = \sup_{\gamma \in \Gamma} |R_n(\gamma)|.$$
Intuitively, under $H_0^l$, $\hat{\xi}$ is a consistent estimate of the true parameter value, and the estimated residuals $\hat{\ddot{\varepsilon}}_{it}^{l}$ fluctuate randomly around zero. As a result, $|R_n(\gamma)|$ tends to be small across all $\gamma \in \Gamma$. In contrast, under $H_1^l$, $\hat{\xi}$ deviates substantially from the true value, and the residuals $\hat{\ddot{\varepsilon}}_{it}^{l}$ contain systematic bias, leading to large values of $|R_n(\gamma)|$ for some $\gamma$. Hence, a large value of $L_n$ provides strong evidence against $H_0^l$.

2.6.1. Limiting Distribution of the Test Statistic

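To evaluate the power of $L_n$, we consider the following local alternative model:
$$y_{it} = \beta_0 x_{it} + n^{-1/2} \beta_1 (x_{it} - \gamma_{it})_+ + \beta_2^T z_{it} + \mu_i + \varepsilon_{it}, \tag{12}$$
where $\beta_1 \neq 0$. To characterize the limiting distribution of $L_n$, we define
$$\begin{aligned}
S_{1n} &= \frac{1}{n} \sum_{i=1}^{n} \sum_{t=1}^{T} \sqrt{12}\, f(\ddot{\varepsilon}_{it})\, \ddot{w}_{it} \ddot{w}_{it}^T, & S_1 &= \sum_{t=1}^{T} E\left[ \sqrt{12}\, f(\ddot{\varepsilon}_{it})\, \ddot{w}_{it} \ddot{w}_{it}^T \right], \\
S_{2n}(\gamma) &= \frac{1}{n} \sum_{i=1}^{n} \sum_{t=1}^{T} \sqrt{12}\, f(\ddot{\varepsilon}_{it})\, \ddot{w}_{it} \ddot{x}_{it}(\gamma), & S_2(\gamma) &= \sum_{t=1}^{T} E\left[ \sqrt{12}\, f(\ddot{\varepsilon}_{it})\, \ddot{w}_{it} \ddot{x}_{it}(\gamma) \right], \\
S_n(\gamma\gamma) &= \frac{1}{n} \sum_{i=1}^{n} \sum_{t=1}^{T} \sqrt{12}\, f(\ddot{\varepsilon}_{it})\, \ddot{x}_{it}(\gamma) \ddot{x}_{it}(\gamma), & S(\gamma\gamma) &= \sum_{t=1}^{T} E\left[ \sqrt{12}\, f(\ddot{\varepsilon}_{it})\, \ddot{x}_{it}(\gamma) \ddot{x}_{it}(\gamma) \right],
\end{aligned}$$
and $\kappa(\gamma) = \left\{ S(\gamma\gamma) - S_2(\gamma)^T S_1^{-1} S_2(\gamma) \right\} \beta_1$. The following theorem shows the large-sample performance of $L_n$ under the local alternative model (12).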
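Theorem 3.
Suppose that the regularity conditions (A1)–(A6) hold. Under the local alternative model (12), $R_n(\gamma)$ admits the asymptotic representation
$$R_n(\gamma) = \frac{1}{\sqrt{n}} \sum_{i=1}^{n} \sum_{t=1}^{T} \sqrt{12} \left( F(\ddot{\varepsilon}_{it}) - 0.5 \right) \left[ \ddot{x}_{it}(\gamma) - c_\phi S_2(\gamma)^T S_1^{-1} \ddot{w}_{it} \right] + \kappa(\gamma) + o_p(1). \tag{13}$$
Furthermore, as $n \to \infty$, the test statistic $L_n$ converges weakly to $\sup_{\gamma \in \Gamma} |R(\gamma) + \kappa(\gamma)|$, where $R(\gamma)$ is a zero-mean Gaussian process with covariance function
$$\sum_{t=1}^{T} E\left\{ \left[ \ddot{x}_{it}(\gamma) - c_\phi S_2(\gamma)^T S_1^{-1} \ddot{w}_{it} \right] \left[ \ddot{x}_{it}(\tilde{\gamma}) - c_\phi S_2(\tilde{\gamma})^T S_1^{-1} \ddot{w}_{it} \right] \right\}$$
for any $(\gamma, \tilde{\gamma}) \in \Gamma \times \Gamma$.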
Under the local alternative model (12), $\kappa(\gamma)$ is generally nonzero for some $\gamma$, whereas it is identically zero for all $\gamma$ under $H_0^l$. As implied by Theorem 3, when $H_0^l$ holds (i.e., $\beta_1 = 0$), $R_n(\gamma)$ converges to the mean-zero Gaussian process $R(\gamma)$. In contrast, under $H_1^l$ (i.e., $\beta_1 \neq 0$), $R_n(\gamma)$ includes an additional drift term $\kappa(\gamma)$, shifting its distribution away from zero. This distinction corroborates our earlier intuition regarding the behavior of $R_n(\gamma)$. Consequently, the proposed test statistic $L_n$ can distinguish $H_1^l$ with a covariate-dependent kink threshold effect from $H_0^l$ with no threshold. Moreover, the power of $L_n$ approaches one when the magnitude of the threshold effect under $H_1^l$ is of order greater than (though it can be arbitrarily close to) $n^{-1/2}$, as stated in the following corollary.
Corollary 1.
Suppose that the conditions in Theorem 3 hold. Under the local alternative model $y_{it} = \beta_0 x_{it} + n^{-1/2} a_n \beta_1 (x_{it} - \gamma_{it})_+ + \beta_2^T z_{it} + \mu_i + \varepsilon_{it}$, for any increasing positive sequence $a_n \to \infty$, $\lim_{n \to \infty} \Pr(|L_n| \ge b) = 1$ holds for any $b > 0$.

2.6.2. A Bootstrap Approach to Compute the p-Value

However, the limiting null distribution of $L_n$ is nonstandard because it depends on the nuisance parameter $\gamma$. To conduct valid inference, we propose a simulation-based method for computing critical values, which leverages the asymptotic representation of $R_n(\gamma)$ in (13). Additionally, the covariance of $L_n$ involves estimation of both the CDF $F(\cdot)$ and the density function $f(\cdot)$ of the errors, which complicates the analysis. Following the approach in [20] for quantile regression, we employ a kernel method and estimate the density at each residual by $\hat{f}(\hat{\ddot{\varepsilon}}_{it}^{l}) = (nT)^{-1} \sum_{j=1}^{n} \sum_{s=1}^{T} K_h(\hat{\ddot{\varepsilon}}_{it}^{l} - \hat{\ddot{\varepsilon}}_{js}^{l})$, where $K_h(\cdot) = K(\cdot / h) / h$, $K(\cdot)$ is a symmetric kernel function, and $h > 0$ is the bandwidth. We impose an additional regularity condition to justify this estimation procedure.
(A7) The symmetric kernel function $K(\cdot)$ satisfies $\int K(u)\,du = 1$ and has a bounded first derivative. The bandwidth $h$ satisfies $h \to 0$ and $nh \to \infty$ as $n \to \infty$.
In practical applications, the statistic $R_n(\gamma)$ and its bootstrap counterpart in Algorithm 1 depend on the bandwidth $h$ through the kernel-based estimators $\hat{S}_{2n}(\gamma)$ and $\hat{S}_{1n}$. For bandwidth selection, we use Silverman’s rule of thumb [26], $h = 1.06\, \hat{\sigma}\, n^{-1/5}$, where $\hat{\sigma}$ is the standard deviation of the residuals $\hat{\ddot{\varepsilon}}_{it}^{l}$ under $H_0^l$. We summarize the bootstrap-based testing procedure in Algorithm 1.
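Algorithm 1 The bootstrap-based test of $L_n$
1: Generate i.i.d. random variables $\{u_1, \ldots, u_n\}$ with $u_i = v_i w_i$, where $v_i$ is drawn from $N(0, 1)$, and $w_i$ (independent of all $v_i$’s) from $\Pr(w_i = 1) = \Pr(w_i = -1) = 0.5$.
2: Calculate the test statistic
$$R_n^*(\gamma) = \frac{1}{\sqrt{n}} \sum_{i=1}^{n} u_i \sum_{t=1}^{T} \sqrt{12} \left( \hat{F}(\hat{\ddot{\varepsilon}}_{it}^{l}) - 0.5 \right) \left[ \ddot{x}_{it}(\gamma) - \hat{c}_\phi \hat{S}_{2n}(\gamma)^T \hat{S}_{1n}^{-1} \ddot{w}_{it} \right],$$
where $\hat{F}(\cdot)$ is the empirical distribution function of $\hat{\ddot{\varepsilon}}_{it}^{l}$ under the null hypothesis,
$$\hat{S}_{2n}(\gamma) = n^{-1} \sum_{i=1}^{n} \sum_{t=1}^{T} \sqrt{12}\, \hat{f}(\hat{\ddot{\varepsilon}}_{it}^{l})\, \ddot{w}_{it} \ddot{x}_{it}(\gamma), \quad \text{and} \quad \hat{S}_{1n} = n^{-1} \sum_{i=1}^{n} \sum_{t=1}^{T} \sqrt{12}\, \hat{f}(\hat{\ddot{\varepsilon}}_{it}^{l})\, \ddot{w}_{it} \ddot{w}_{it}^T.$$
3: Repeat Steps 1–2 NB times to obtain $L_n^{*(1)}, \ldots, L_n^{*(\mathrm{NB})}$, where $L_n^{*(j)} = \sup_{\gamma \in \Gamma} |R_n^{*(j)}(\gamma)|$. The p-value is calculated by $\hat{p}_n = \mathrm{NB}^{-1} \sum_{j=1}^{\mathrm{NB}} 1\{L_n^{*(j)} \ge L_n\}$.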
The following result establishes the validity of the bootstrap resampling scheme, whose proof is given in Appendix A.
Theorem 4.
Suppose that regularity conditions (A1)–(A7) hold. Then, under both the null and the local alternative hypotheses, $R_n^*(\gamma)$ defined in Algorithm 1 converges weakly to the Gaussian process $R(\gamma)$ as $n \to \infty$.
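The sup-score statistic and the multiplier bootstrap of Algorithm 1 can be sketched as below. This is a minimal sketch under several assumptions: a balanced panel, a Gaussian kernel, a finite list of centered kink regressors standing in for the supremum over the threshold set, and a user-supplied scale estimate `c_phi` (its estimator is not reproduced here). The function names are hypothetical.

```python
import numpy as np

def kde_at(e, h):
    """Gaussian-kernel density estimate evaluated at each residual."""
    u = (e[:, None] - e[None, :]) / h
    return np.exp(-0.5 * u ** 2).mean(axis=1) / (h * np.sqrt(2.0 * np.pi))

def sup_score_bootstrap(resid, w_c, kink_c_list, c_phi, n_boot=200, seed=0):
    """resid: (n, T) null-model residuals; w_c: (n, T, p) centered regressors;
    kink_c_list: centered kink regressors (n, T), one per grid value of gamma."""
    n, T = resid.shape
    e = resid.ravel()
    h = 1.06 * e.std(ddof=1) * n ** (-0.2)            # Silverman's rule of thumb
    f_hat = kde_at(e, h)
    W = w_c.reshape(n * T, -1)
    S1 = (np.sqrt(12.0) * f_hat[:, None] * W).T @ W / n
    ranks = np.argsort(np.argsort(e)) + 1
    score_obs = np.sqrt(12.0) * (ranks / (n * T + 1) - 0.5)
    score_boot = np.sqrt(12.0) * (ranks / (n * T) - 0.5)   # empirical CDF version
    obs, boot = [], []
    for kc in kink_c_list:
        xk = kc.ravel()
        S2 = (np.sqrt(12.0) * f_hat * xk) @ W / n
        proj = xk - c_phi * (W @ np.linalg.solve(S1, S2))
        obs.append((score_obs * proj).reshape(n, T).sum(axis=1))
        boot.append((score_boot * proj).reshape(n, T).sum(axis=1))
    obs, boot = np.array(obs), np.array(boot)          # each (n_gamma, n)
    L_n = np.max(np.abs(obs.sum(axis=1))) / np.sqrt(n)
    rng = np.random.default_rng(seed)
    # Multipliers u_i = v_i * w_i: standard normal times a random sign
    u = rng.standard_normal((n_boot, n)) * rng.choice([-1.0, 1.0], size=(n_boot, n))
    L_star = np.max(np.abs(u @ boot.T), axis=1) / np.sqrt(n)
    return float(L_n), float(np.mean(L_star >= L_n))
```

Only the multipliers are redrawn across bootstrap replications, so the rank scores, density estimates, and projections are computed once.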

3. Simulation Studies

In this section, we conduct simulation studies to examine the finite-sample performance of the proposed estimation method and testing procedures. In particular, we evaluate the accuracy of parameter estimation, the type I error and power of the test for the presence of a kink threshold effect, and the performance of the test for threshold constancy. Similar to [12], we generate data using the following data-generating processes (DGPs):
$$\begin{aligned}
\text{DGP 1}: &\quad y_{it} = x_{it} - (x_{it} - \gamma_{it})_+ + 2 z_{it} + \mu_i + \varepsilon_{it}, \\
\text{DGP 2}: &\quad y_{it} = x_{it} - (x_{it} - \gamma)_+ + 2 z_{it} + \mu_i + \varepsilon_{it}, \\
\text{DGP 3}: &\quad y_{it} = x_{it} + 2 z_{it} + \mu_i + \varepsilon_{it},
\end{aligned}$$
where $x_{it} = 0.25 \mu_i + u_{q,it} + u_{x,it}$, $z_{it} = 0.5 \mu_i + u_{q,it} + u_{z,it}$, $\gamma_{it} = \gamma_0 + \gamma_1 q_{it}$, $q_{it} = z_{it}$, $u_{q,it} \sim$ i.i.d. $N(0.5, 1)$, $u_{x,it}$ and $u_{z,it}$ both follow i.i.d. $N(0, 1)$, and $\mu_i \sim$ i.i.d. $N(0, 1)$. The innovation terms $\varepsilon_{it}$, $u_{x,it}$, $u_{z,it}$ and $u_{q,it}$ are mutually independent. DGP 1 corresponds to the covariate-dependent kink threshold regression model with true parameters $(\beta_0, \beta_1, \beta_2) = (1, -1, 2)$ and $(\gamma_0, \gamma_1) = (0, 0.5)$. DGP 2 specifies a constant kink threshold regression model, while DGP 3 represents a standard linear regression model. DGPs 2 and 3 serve as benchmark specifications for evaluating the proposed tests. We consider four error distributions for $\varepsilon_{it}$, all standardized to have mean zero and unit variance: (i) $N(0, 1)$; (ii) Student’s t-distribution with three degrees of freedom, $t(3)$; (iii) Tukey contaminated normal $T(0.1, 10)$ ($0.9 N(0, 1) + 0.1 N(0, 10)$); and (iv) Lognormal distribution $LN(0, 1)$. We use sample sizes $n = 50, 100$ and time spans $T = 10, 20$. For each case, we conduct NS = 500 replications.
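A draw from DGP 1 can be generated as follows. This is a minimal sketch, assuming the second argument of each normal distribution is a variance; only the normal and standardized t(3) errors are included, and the function name and its `error` argument are hypothetical.

```python
import numpy as np

def simulate_dgp1(n, T, rng, error="normal"):
    """One panel from DGP 1 with (beta0, beta1, beta2) = (1, -1, 2), gamma = (0, 0.5)."""
    mu = rng.standard_normal((n, 1))                  # individual effects
    u_q = rng.normal(0.5, 1.0, size=(n, T))
    x = 0.25 * mu + u_q + rng.standard_normal((n, T))
    z = 0.5 * mu + u_q + rng.standard_normal((n, T))
    q = z                                             # threshold covariate
    if error == "normal":
        eps = rng.standard_normal((n, T))
    elif error == "t3":
        # t(3) has variance 3, so divide by sqrt(3) for unit variance
        eps = rng.standard_t(3, size=(n, T)) / np.sqrt(3.0)
    else:
        raise ValueError("unsupported error distribution")
    gamma_it = 0.0 + 0.5 * q
    y = x - np.maximum(x - gamma_it, 0.0) + 2.0 * z + mu + eps
    return y, x, z, q

y, x, z, q = simulate_dgp1(50, 10, np.random.default_rng(2026))
```

Because the individual effect enters both regressors, a pooled fit that ignores it would be biased, which is what the within-individual centering is meant to address.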

3.1. Estimation Accuracy

To evaluate the performance of the proposed estimation method, we compute several metrics for each parameter estimator: bias, standard deviation (SD), average estimated standard error (ESE), mean squared error (MSE), and empirical coverage probability (ECP). These results are compared with those obtained from the least squares (LS) estimator proposed by [12], abbreviated as Yang. Specifically, for the $j$th component of $\theta$, denoted as $\theta_j$, $\mathrm{Bias}(\theta_j) = \mathrm{NS}^{-1}\sum_{m=1}^{\mathrm{NS}} (\hat{\theta}_j^{(m)} - \theta_j)$ and $\mathrm{SD}(\theta_j) = \sqrt{\mathrm{NS}^{-1}\sum_{m=1}^{\mathrm{NS}} (\hat{\theta}_j^{(m)} - \bar{\theta}_j)^2}$, where $\bar{\theta}_j = \mathrm{NS}^{-1}\sum_{m=1}^{\mathrm{NS}} \hat{\theta}_j^{(m)}$ is the average of the estimates across replications, $\hat{\theta}_j^{(m)}$ is the estimate in the $m$th replication, and $\theta_j$ is the true parameter value. The ESE is defined as $\mathrm{ESE}(\hat{\theta}_j) = \mathrm{NS}^{-1}\sum_{m=1}^{\mathrm{NS}} \hat{\sigma}_j^{(m)}$, where $\hat{\sigma}_j^2$ is the $j$th diagonal element of $\hat{G}^{-1}\hat{\Sigma}\hat{G}^{-1}$, as defined in (9). The ECP for $\hat{\theta}_j$ is computed based on the $(1-\alpha)$ Wald confidence interval,
$$\mathrm{Wald}_{1-\alpha}(\theta_j) = \left[\hat{\theta}_j - N^{-1/2} z_{1-\alpha/2}\,\hat{\sigma}_j,\ \hat{\theta}_j + N^{-1/2} z_{1-\alpha/2}\,\hat{\sigma}_j\right],$$
where $z_{1-\alpha/2}$ is the $(1-\alpha/2)$-quantile of the standard normal distribution. We set $\alpha = 0.05$. The ECP is then defined as the proportion of simulations in which the true parameter value lies within the Wald-type confidence interval.
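The metrics defined above can be computed from the replication output along the following lines. This is a minimal sketch, not the authors' code: the function name `summarize_replications` and its arguments are hypothetical, `n_obs` plays the role of $N$ in the Wald interval, and the MSE is taken to be the usual average squared deviation from the true value, which the text does not spell out.

```python
import numpy as np
from statistics import NormalDist

def summarize_replications(estimates, sigma_hats, theta_true, n_obs, alpha=0.05):
    """Bias, SD, ESE, MSE and ECP for one parameter across NS replications."""
    est = np.asarray(estimates, dtype=float)    # theta_hat_j^(m), m = 1..NS
    sig = np.asarray(sigma_hats, dtype=float)   # sigma_hat_j^(m), m = 1..NS
    bias = est.mean() - theta_true
    sd = np.sqrt(np.mean((est - est.mean()) ** 2))
    ese = sig.mean()
    mse = np.mean((est - theta_true) ** 2)      # ASSUMPTION: standard MSE
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    half_width = z * sig / np.sqrt(n_obs)       # N^{-1/2} z sigma_hat_j
    covered = (est - half_width <= theta_true) & (theta_true <= est + half_width)
    return {"bias": bias, "sd": sd, "ese": ese, "mse": mse, "ecp": covered.mean()}

summary = summarize_replications([0.9, 1.1, 1.0, 1.0], [1.0, 1.0, 1.0, 1.0],
                                 theta_true=1.0, n_obs=100)
```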
Table 1 and Table 2 summarize the estimation results, reporting the bias, SD, ESE, MSE, and ECP for each estimator. Several key findings emerge. (i) When the error term follows a standard normal distribution, both estimators perform well and exhibit comparable accuracy. Both yield estimates close to the true parameter values with negligible bias, indicating that they are effectively unbiased. Furthermore, the ESEs closely match the empirical SDs. The MSEs of Yang’s estimators are slightly smaller than those of the proposed estimators, which is expected since rank-based estimators in linear regression with normal errors achieve approximately 95% relative efficiency compared to the LS estimator [18]. (ii) Under non-normal error distributions, the proposed estimators significantly outperform the LS estimators, as evidenced by smaller SDs and MSEs. Although Yang’s method shows some improvement with increasing n or T, the gains are considerably more modest than those achieved by our method. (iii) In terms of ECP, the proposed method generally yields coverage rates closer to the nominal 95% level than Yang’s method across most settings. As n or T increases, the ECPs for the regression coefficients from both methods approach the 95% target. However, under heavy-tailed error distributions, our estimator demonstrates greater stability in maintaining valid coverage. (iv) The ECPs for the threshold parameters obtained from Yang’s method are consistently lower than those for the regression coefficients, often falling below 90% and dropping below 60% under heavy-tailed errors. This indicates that threshold parameter estimation is more sensitive to the error distribution, leading to higher variability. In contrast, the proposed method yields more stable and reliable coverage for both regression and threshold parameters. 
In summary, compared with Yang’s estimators, the proposed rank-based estimators offer superior robustness to outliers and heavy-tailed errors while maintaining competitive performance under normality.

3.2. Type I Error and Power Analysis

We next evaluate the finite-sample size and power of the test statistic $W_n$ for threshold constancy proposed in Section 2.5. We focus on DGP 1 and DGP 2, which feature a covariate-dependent kink threshold (i.e., $H_1^c$ holds) and a constant kink threshold (i.e., $H_0^c$ holds), respectively. For comparison, we also include the Wald test based on the asymptotic theory of the LS estimator proposed by [12]. Table 3 reports the empirical rejection frequencies for DGPs 1 and 2 at the 5% nominal significance level. First, when the error term follows a standard normal distribution, our test maintains a Type I error rate close to the nominal level, whereas Yang’s Wald test is slightly oversized. The power of both tests is reasonably high in this setting. Second, under heavy-tailed or contaminated normal error distributions, Yang’s test becomes anti-conservative, exhibiting both inflated Type I error rates and high power. This is primarily because Yang’s method relies on the LS framework, which lacks robustness to outliers and deviations from normality. In contrast, our test preserves the nominal Type I error rate across all error distributions and maintains satisfactory power, demonstrating its superior robustness.
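For intuition on how the rejection frequencies in such a study are tabulated, the sketch below uses a generic scalar Wald statistic for $H_0^c: \gamma_1 = 0$. The exact form of $W_n$ is given in Section 2.5 and is not reproduced here; the squared-ratio form, the function names, and the restriction to a scalar $\gamma_1$ are assumptions of this illustration.

```python
import math

def wald_constancy_test(gamma1_hat, se_gamma1):
    """Generic scalar Wald statistic with its chi-square(1) p-value."""
    w = (gamma1_hat / se_gamma1) ** 2
    # For X ~ chi-square(1): P(X > w) = erfc(sqrt(w / 2)).
    p_value = math.erfc(math.sqrt(w / 2.0))
    return w, p_value

def rejection_frequency(p_values, level=0.05):
    """Share of replications in which the null is rejected at `level`."""
    return sum(p < level for p in p_values) / len(p_values)

w, p = wald_constancy_test(gamma1_hat=0.5, se_gamma1=0.2)
```

Under DGP 2 the rejection frequency estimates the Type I error rate, and under DGP 1 it estimates the power.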
We also conduct a simulation study to evaluate the finite-sample performance of the test statistic $L_n$ for testing the presence of a kink threshold effect. For this purpose, we consider DGP 1 and DGP 3, which correspond to a kink threshold regression model with a covariate-dependent threshold and a standard linear regression model, respectively. To assess the performance of our proposed test, we compare it with two existing approaches: the sup-Wald test statistic proposed by [12] and the score-based test statistic introduced by [16]. Both of these tests are based on the LS estimation framework and are referred to as Yang and Zhou, respectively. Our test is the rank-based procedure developed in Section 2.6. In implementing our test, we use the Epanechnikov kernel $K(u) = \frac{3}{4}(1 - u^2)\, I(|u| \le 1)$ and select the bandwidth $h = 1.06\,\hat{\sigma}\,(nT)^{-1/5}$.
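The kernel and bandwidth choices stated above can be sketched as follows. This is an illustrative implementation only: the function names are hypothetical, and $\hat{\sigma}$ is assumed here to be the sample standard deviation of the residuals, which the text does not specify.

```python
import numpy as np

def epanechnikov(u):
    """K(u) = 3/4 (1 - u^2) on |u| <= 1, and zero elsewhere."""
    u = np.asarray(u, dtype=float)
    return 0.75 * (1.0 - u ** 2) * (np.abs(u) <= 1.0)

def rule_of_thumb_bandwidth(residuals):
    """h = 1.06 * sigma_hat * (nT)^(-1/5); sigma_hat is the sample SD (assumption)."""
    r = np.asarray(residuals, dtype=float).ravel()
    return 1.06 * r.std(ddof=1) * r.size ** (-0.2)

def kernel_density(points, residuals, h=None):
    """Kernel density estimate of the residual density at `points`."""
    r = np.asarray(residuals, dtype=float).ravel()
    x = np.atleast_1d(np.asarray(points, dtype=float))
    if h is None:
        h = rule_of_thumb_bandwidth(r)
    u = (x[:, None] - r[None, :]) / h
    return epanechnikov(u).mean(axis=1) / h

rng = np.random.default_rng(1)          # hypothetical seed
resid = rng.standard_normal(500)        # stand-in for nT = 500 residuals
f_hat = kernel_density([0.0, 1.0], resid)
```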
Table 4 summarizes the empirical rejection rates of the three test statistics. The following key findings emerge. First, the size (i.e., the rejection frequency under DGP 3, the null linear model) of both our method and Zhou’s method is close to the nominal 5% level, whereas Yang’s method is substantially undersized, with rejection rates far below the nominal level. This indicates that the Wald-type test statistic proposed by Yang is highly conservative in detecting the presence of a threshold effect. Second, when the kink threshold effect is present (i.e., under DGP 1), all three testing methods perform reasonably well. When $n$ or $T$ is small (for example, $n = 50$ and $T = 10$), the power of both our method and Yang’s method is close to 1, substantially exceeding that of Zhou’s method. As expected, as $n$ or $T$ increases, the power of Zhou’s method also approaches 1. In summary, the proposed test statistic $L_n$ demonstrates satisfactory size control and competitive power across various sample configurations, confirming its effectiveness in finite-sample settings.

4. An Empirical Application

4.1. Data and Model Specification

Understanding the relationship between female wage income and working hours is a long-standing topic in labor economics (see, e.g., [27,28,29,30]). Empirical studies in this field have suggested that the relationship between female wage income and working hours may be nonlinear. For example, Ref. [31] employed a panel threshold regression model to document a positive but nonlinear relationship between weekly hours worked and hourly wage growth for women. Ref. [32] developed a dynamic model showing that the wage–hour relationship is nonlinear due to occupational sorting and labor market constraints, which result in heterogeneous returns across the hours distribution. This nonlinearity is also found to differ significantly by gender, with women generally experiencing lower returns than men in the upper range of the hours distribution.
In this paper, we apply the proposed robust covariate-dependent kink threshold regression model to capture the nonlinear relationship between female wage income and working hours. The panel wage dataset we use is sourced from [33], originally collected through the National Longitudinal Surveys (NLS) conducted by the U.S. Department of Labor. This public dataset is available in the R package PoEdata as the data file “nls_panel.dat”. Our sample consists of 716 female respondents who were surveyed over five waves. Notably, Ref. [16] analyzed the same dataset using the LS estimator and confirmed the presence of a covariate-dependent kink effect. Following their work, we specify the following econometric model:
$$\mathrm{lwage}_{it} = \mu_i + \beta_0\,\mathrm{exper}_{it} + \beta_1(\mathrm{exper}_{it} - \gamma_{it})_{+} + \beta_2\,\mathrm{hours}_{it} + \varepsilon_{it}, \quad \gamma_{it} = \gamma_0 + \gamma_1\,\mathrm{hours}_{it}, \quad i = 1, \ldots, 716;\ t = 1, \ldots, 5, \tag{14}$$
where $\mathrm{lwage}_{it}$ is the log-transformed hourly wage (the dependent variable), $\mathrm{exper}_{it}$ denotes the total labor force experience, and $\mathrm{hours}_{it}$ denotes the usual weekly working hours. For estimation purposes, both $\mathrm{exper}_{it}$ and $\mathrm{hours}_{it}$ are standardized to lie in the unit interval $[0, 1]$. The tipping point $\gamma_{it}$ is modeled as a linear function of the informative covariate $\mathrm{hours}_{it}$.
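To make the specification concrete, the sketch below builds the ingredients of the working model for given threshold parameters: rescaled regressors, the kink term $(\mathrm{exper}_{it} - \gamma_0 - \gamma_1\mathrm{hours}_{it})_{+}$, and the within transformation that removes $\mu_i$. The helper names are hypothetical, and min-max scaling is only one plausible reading of "standardized to lie in the unit interval".

```python
import numpy as np

def minmax_scale(v):
    """Rescale a variable to [0, 1] (ASSUMPTION: min-max scaling)."""
    v = np.asarray(v, dtype=float)
    return (v - v.min()) / (v.max() - v.min())

def kink_term(exper, hours, gamma0, gamma1):
    """(exper_it - gamma_it)_+ with gamma_it = gamma0 + gamma1 * hours_it."""
    gamma_it = gamma0 + gamma1 * np.asarray(hours, dtype=float)
    return np.maximum(np.asarray(exper, dtype=float) - gamma_it, 0.0)

def within_transform(values, ids):
    """Subtract each individual's time average, which removes mu_i."""
    values = np.asarray(values, dtype=float)
    _, inverse = np.unique(ids, return_inverse=True)
    means = np.bincount(inverse, weights=values) / np.bincount(inverse)
    return values - means[inverse]

# Toy panel with two individuals and two waves each (made-up numbers).
ids = np.array([1, 1, 2, 2])
exper = minmax_scale([2.0, 4.0, 6.0, 10.0])
hours = minmax_scale([20.0, 40.0, 35.0, 50.0])
kink_demeaned = within_transform(kink_term(exper, hours, 0.3, 0.2), ids)
```

Because $\gamma_{it}$ changes with the candidate $(\gamma_0, \gamma_1)$, the kink term and its demeaned version have to be recomputed at every point of a threshold grid.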

4.2. Estimation Results

Table 5 presents the estimation and testing results for the working model (14), obtained using both the mean regression method of [12] and the proposed rank regression approach. We begin by testing whether the kink threshold location $\gamma_{it}$ depends on $\mathrm{hours}_{it}$, i.e., $H_0^c: \gamma_1 = 0$. The p-values from both Yang’s test and our proposed test are below 0.1, so the null hypothesis is rejected at the 10% significance level, which supports the presence of a covariate-dependent kink threshold effect. Next, we test for the existence of a kink effect, i.e., $H_0^l: \beta_1 = 0$. Using the sup-Wald test statistic based on mean regression [12] and our proposed test procedure in Algorithm 1 with NB = 1000 bootstrap replications, we obtain p-values of 0.000 and 0.010, respectively. Both tests reject the null hypothesis of linearity in favor of the kink threshold regression specification at the 5% significance level. Taken together, these results support the use of the panel kink threshold regression model (14) with a covariate-dependent threshold.
We now examine the parameter estimation results obtained from the LS estimator and the proposed robust rank-based estimator. Several noteworthy findings emerge. First, while work experience in years ($\mathrm{exper}_{it}$) exerts a positive effect on the log wage ($\mathrm{lwage}_{it}$), as indicated by $\hat{\beta}_0 > 0$ and $\hat{\beta}_0 + \hat{\beta}_1 > 0$, the relationship between $\mathrm{exper}_{it}$ and $\mathrm{lwage}_{it}$ is nonlinear and varies across the two regimes defined by the threshold: $\mathrm{exper}_{it} \le \gamma_{it}$ and $\mathrm{exper}_{it} > \gamma_{it}$. Specifically, the regression function is steeper in the first regime, with a larger marginal return $\hat{\beta}_0$, and becomes flatter in the second regime, with a reduced marginal return $\hat{\beta}_0 + \hat{\beta}_1$. This pattern is consistent with the findings of [16]. Second, our estimation results reveal a similar qualitative pattern to that reported by [12]. However, compared with the LS estimator, the standard errors of our robust estimator are noticeably smaller, indicating improved efficiency. This gain can be attributed to the rank-based estimation framework, which reduces sensitivity to outliers and heavy-tailed disturbances. In summary, the panel kink threshold regression model with a covariate-dependent threshold provides an effective tool for capturing nonlinear wage dynamics. Moreover, the proposed rank estimator enhances robustness while maintaining efficiency, making it well-suited for empirical analysis in labor economics.

4.3. Influence of Outliers and Sensitivity Analysis

To assess whether robustness is empirically relevant, we first examine the distributional properties of the residuals from the LS estimator. Figure 1 shows that the residuals deviate substantially from normality, exhibiting pronounced heavy tails and moderate right-skewness. This evidence is reinforced by the summary statistics in Table 6. The skewness coefficient is 0.480, indicating asymmetry, while the kurtosis reaches 17.396, far exceeding the Gaussian benchmark of 3. Moreover, both the Jarque–Bera and Shapiro–Wilk tests strongly reject normality. These findings indicate that the error distribution departs markedly from the classical assumptions underlying LS estimation, suggesting that LS may be highly sensitive to extreme observations and may yield unreliable inference.
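The moment-based diagnostics quoted above can be reproduced from any residual vector along the following lines. This is a sketch with a hypothetical function name; it uses the textbook Jarque–Bera form $\mathrm{JB} = \frac{n}{6}\left(S^2 + \frac{(K-3)^2}{4}\right)$ with its asymptotic $\chi^2(2)$ reference distribution, which may differ in small-sample details from the software used for Table 6.

```python
import numpy as np

def moment_diagnostics(residuals):
    """Skewness, kurtosis and the Jarque-Bera statistic with its p-value."""
    r = np.asarray(residuals, dtype=float).ravel()
    n = r.size
    c = r - r.mean()
    m2, m3, m4 = np.mean(c ** 2), np.mean(c ** 3), np.mean(c ** 4)
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2                  # equals 3 for a Gaussian
    jb = n / 6.0 * (skew ** 2 + (kurt - 3.0) ** 2 / 4.0)
    p_value = float(np.exp(-jb / 2.0))   # chi-square(2) survival function
    return {"skewness": skew, "kurtosis": kurt, "jb": jb, "p_value": p_value}

diag = moment_diagnostics([-2.0, -1.0, 0.0, 1.0, 2.0])
```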
To further investigate the impact of such observations, we conduct an influence diagnostic analysis. Figure 2 reports Cook’s distance and the residual–leverage relationship. The Cook’s distance plot shows that while most observations have negligible influence, a nontrivial subset exhibits relatively large values, indicating that influence is unevenly distributed across the sample. The residual-leverage plot further reveals that several observations combine relatively high leverage with sizable residuals, suggesting that they may exert a disproportionate effect on the estimated regression function.
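For readers who wish to reproduce this type of diagnostic, the sketch below computes leverage and Cook's distance for a generic least squares fit. It assumes a plain design matrix `X` (for the panel model this would hold the within-transformed regressors at the estimated threshold) and is not the authors' implementation; the function name is hypothetical.

```python
import numpy as np

def cooks_distance(X, y):
    """Cook's distance, leverage and residuals for an OLS fit of y on X."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, p = X.shape
    q, _ = np.linalg.qr(X)                    # reduced QR factorization
    leverage = np.sum(q ** 2, axis=1)         # diagonal of the hat matrix
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    s2 = resid @ resid / (n - p)              # residual variance estimate
    cook = resid ** 2 / (p * s2) * leverage / (1.0 - leverage) ** 2
    return cook, leverage, resid

# Toy data: the last point has high leverage and a large response.
x = np.array([0.0, 1.0, 2.0, 3.0, 10.0])
X = np.column_stack([np.ones_like(x), x])
y = np.array([0.0, 1.0, 2.0, 3.0, 20.0])
cook, leverage, resid = cooks_distance(X, y)
```

Observations that combine high leverage with a sizable residual receive the largest Cook's distance, which is the pattern the residual-leverage plot is meant to expose.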
To evaluate the impact of influential observations, we conduct a sensitivity analysis by re-estimating the model after trimming extreme observations. The results in Table 7 show that LS estimates vary noticeably across samples, particularly for the kink and threshold parameters, indicating substantial sensitivity to extreme data points. In contrast, the rank-based estimates remain highly stable across specifications, with only minimal changes in key parameters. This stability suggests that the proposed estimator is considerably less affected by influential observations. Taken together, these results provide strong empirical evidence that the proposed method delivers more reliable inference in the presence of heavy-tailed errors and outliers.

5. Conclusions

In this paper, we develop a rank-based estimation procedure for the panel kink threshold regression model with a covariate-dependent threshold. To address the non-differentiability of the objective function, we propose a two-stage estimation strategy for simultaneously estimating the regression coefficients and threshold parameters. We establish the joint asymptotic normality of the slope and threshold estimators, which facilitates the construction of a standard Wald-type test for threshold constancy. Additionally, we introduce a rank score-based statistic to test for the presence of a threshold effect. Extensive numerical studies demonstrate that the proposed estimators and tests perform well in finite samples, offering both robustness and reliable inference.
The asymptotic theory developed in this paper relies on a set of standard regularity conditions commonly adopted in the panel threshold literature. While these assumptions facilitate tractable theoretical analysis, it is important to clarify their scope and practical implications. First, our framework is developed under the asymptotic regime where the cross-sectional dimension $n \to \infty$ while the time dimension $T$ remains fixed. This setting is appropriate for typical micro-panel datasets, where a large number of individuals are observed over a relatively small number of periods. However, the proposed method is not directly designed for panels with large $T$, where time series dependence may play a more prominent role. Extending the framework to accommodate large-$T$ asymptotics would require additional techniques and is left for future research. Second, the model assumes cross-sectional independence across individuals. This assumption simplifies the derivation of the asymptotic distribution but may be restrictive in empirical applications where common shocks or cluster-level dependence are present. In such cases, the variance estimation procedure may need to be adjusted, for example, by incorporating cluster-robust covariance estimators. Developing a fully robust inference procedure under cross-sectional dependence remains an important direction for future work. Third, we impose a strict exogeneity condition on the regressors and threshold covariates. In particular, the explanatory variables and covariates determining the threshold are assumed to be uncorrelated with the idiosyncratic error term. While this assumption is standard in the panel threshold literature, it may be violated in applications where regressors are endogenous or only weakly exogenous. Addressing endogeneity would require extending the current framework to incorporate instrumental variable techniques or control function approaches within a rank-based setting, which poses nontrivial challenges.
Fourth, the theory assumes that the conditional density of the threshold variable is bounded and continuous. This condition ensures identification and stable estimation of the threshold parameters. In practice, this assumption is generally mild, but it may be violated in cases with discrete or highly concentrated covariates, in which case the performance of the estimator may be affected.

Author Contributions

Conceptualization, D.M. and Y.W.; methodology, C.W.; software, H.H.; validation, Y.L., Y.W. and H.H.; formal analysis, C.W.; investigation, D.M.; resources, C.W.; data curation, D.M.; writing—original draft preparation, D.M.; writing—review and editing, Y.W.; visualization, Y.L.; supervision, Y.W.; project administration, C.W.; funding acquisition, D.M. and C.W. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Youth Project of the Fujian Provincial Social Science Foundation (Grant No. FJ2024C030), the Fujian Provincial Natural Science Foundation Youth Innovation Project (Grant No. 2026J008253), the Natural Science Foundation of Hunan Province (2026JJ50259), the Scientific Research Project of Education Department of Hunan Province (25A0059), the National Natural Science Foundation of China (12301344, 12571284, 72373074), and China Postdoctoral Science Foundation (2024M761153).

Data Availability Statement

All empirical analyses presented in this manuscript are conducted using the publicly available NLS-based panel wage dataset. The dataset is distributed with the R package PoEdata under the filename “nls_panel.dat”, and is freely accessible to the research community.

Conflicts of Interest

The writers affirm that they have no conflicting interests that could appear to influence the present work.

Appendix A

This appendix contains the proofs of the main theorems presented in the paper. Throughout, we denote a generic positive constant by C, which may take different values at different occurrences.

Appendix A.1. Proofs for Section 2.2

Proof of Theorem 1.
(i) We first show the consistency of $\hat{\theta}$, using empirical process notation. Denote by $P_n$ the empirical measure and by $P$ the probability measure. That is, $Pg = E\,g(X)$ and $P_n g = n^{-1}\sum_{i=1}^{n} g(X_i)$ for any measurable function $g(\cdot)$.
Define $e_{it}(\theta) = \ddot{y}_{it} - \beta^{T}\ddot{x}_{it}(\gamma) = \ddot{\varepsilon}_{it} + (\beta^{\star} - \beta)^{T}\ddot{x}_{it}(\gamma^{\star}) + \beta^{T}[\ddot{x}_{it}(\gamma^{\star}) - \ddot{x}_{it}(\gamma)]$, where $\theta^{\star} = (\beta^{\star}, \gamma^{\star})$ denotes the true parameter value. Minimizing $Q_n(\theta)$ is equivalent to minimizing $P_n g(\theta)$ with respect to $\theta$, where
$$g(\theta) = \sum_{t=1}^{T} a(e_{it}(\theta))\, e_{it}(\theta),$$
and $a(e_{it}(\theta)) = \sqrt{12}\left[\frac{R(e_{it}(\theta))}{nT+1} - 0.5\right]$. By Condition (A4), $Pg(\theta)$ is continuous in $\theta$ and is uniquely minimized at $\theta^{\star}$. We need to show that the class of functions $\{g(\theta): \theta \in \Theta\}$ is Glivenko-Cantelli, that is, $\sup_{\theta \in \Theta} |P_n g(\theta) - Pg(\theta)| \xrightarrow{P} 0$ as $n$ goes to infinity. Since $\Theta$ is compact, it is easy to verify that both $P_n g(\theta)$ and $Pg(\theta)$ are continuous in $\theta$. By the weak law of large numbers, we have $P_n g(\theta) \xrightarrow{P} Pg(\theta)$ pointwise in $\theta$. It remains to show that $g(\theta)$ is Lipschitz continuous in probability. Note that for any given $x_{it}$, $x_{it}(\gamma)$ is continuous in $\gamma$, and hence $\ddot{x}_{it}(\gamma) = x_{it}(\gamma) - T^{-1}\sum_{s=1}^{T} x_{is}(\gamma)$ is also continuous in $\gamma$. Furthermore, $e_{it}(\theta)$ is continuous in $\theta$. By the triangle inequality, we have the following bound
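$$\|\ddot{x}_{it}(\gamma)\| \le \|x_{it}(\gamma)\| + \frac{1}{T}\sum_{s=1}^{T}\|x_{is}(\gamma)\| \le \|z_{it}\| + |x_{it}| + \bar{C}_{\Gamma_0} + \|q_{it}\|\,\bar{C}_{\Gamma_1} + \frac{1}{T}\sum_{s=1}^{T}\left(\|z_{is}\| + |x_{is}| + \bar{C}_{\Gamma_0} + \|q_{is}\|\,\bar{C}_{\Gamma_1}\right),$$
where $\bar{C}_{\Gamma_0} = \sup\{|\gamma_0| : \gamma_0 \in \Gamma_0\}$ and $\bar{C}_{\Gamma_1} = \sup\{\|\gamma_1\| : \gamma_1 \in \Gamma_1\}$. Thus, by Condition (A1),
$$|Pg(\theta)| \le \sum_{t=1}^{T}\left|\sqrt{12}\left[\frac{R(e_{it}(\theta))}{nT+1} - 0.5\right]\right| \cdot |e_{it}(\theta)| \le \sum_{t=1}^{T}\left|\sqrt{12}\left[\frac{R(e_{it}(\theta))}{nT+1} - 0.5\right]\right| \cdot \left(|\ddot{y}_{it}| + \bar{\beta}\,\|\ddot{x}_{it}(\gamma)\|\right) < \infty,$$
where $\bar{\beta} = \sup\{\|\beta\| : \beta \in B\}$. Moreover, for all $\tilde{\theta}$ and $\theta$ in $\Theta$, we have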
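$$|P_n g(\tilde{\theta}) - P_n g(\theta)| \le C\sum_{t=1}^{T}\left(\|\tilde{\beta}\| \cdot \|\ddot{x}_{it}(\tilde{\gamma}) - \ddot{x}_{it}(\gamma)\| + \|\tilde{\beta} - \beta\| \cdot \|\ddot{x}_{it}(\gamma)\|\right) \le C\sum_{t=1}^{T}\left(\|\tilde{\beta}\| \cdot \|\tilde{\gamma} - \gamma\| + \|\tilde{\beta} - \beta\| \cdot \|\ddot{x}_{it}(\gamma)\|\right)$$
for some positive constant $C$. By Conditions (A1) and (A4), $\|\theta\|$ and $E\|\ddot{x}_{it}(\gamma)\|$ are bounded. Then, there exists a positive $B_n = O_p(1)$ such that $|P_n g(\tilde{\theta}) - P_n g(\theta)| \le B_n\|\tilde{\theta} - \theta\|$ for all $\tilde{\theta}$ and $\theta$ in $\Theta$. Hence, the empirical process $\theta \mapsto P_n g(\theta)$ is stochastically equicontinuous, which implies the uniform convergence of $P_n g(\theta)$ to $Pg(\theta)$.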
Given the compactness of $\Gamma \times B$ and the assumed uniqueness of the minimizer at the true value $\theta^{\star}$, Theorem 2.1 of [34] yields $\hat{\theta} \xrightarrow{P} \theta^{\star}$.
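The objective whose minimizer is studied above can be evaluated numerically as in the sketch below, which computes the Wilcoxon scores $a(e) = \sqrt{12}\,[R(e)/(nT+1) - 0.5]$ and the resulting rank dispersion $\sum_{i,t} a(e_{it})\,e_{it}$ for a pooled residual vector. The function names are hypothetical and ties among residuals are assumed away.

```python
import numpy as np

def wilcoxon_scores(residuals):
    """a(e) = sqrt(12) * (R(e) / (N + 1) - 0.5), with N = nT pooled residuals."""
    e = np.asarray(residuals, dtype=float).ravel()
    ranks = np.argsort(np.argsort(e)) + 1     # ranks 1..N (assumes no ties)
    return np.sqrt(12.0) * (ranks / (e.size + 1.0) - 0.5)

def rank_dispersion(residuals):
    """Jaeckel-type dispersion: sum of score times residual."""
    e = np.asarray(residuals, dtype=float).ravel()
    return float(np.sum(wilcoxon_scores(e) * e))

e = np.array([3.0, -1.0, 2.0, 0.5])
value = rank_dispersion(e)
```

Because the scores sum to zero, the dispersion is unchanged when a constant is added to all residuals. An extreme residual also enters only linearly, with a weight bounded by $\sqrt{3}$, rather than quadratically as under least squares.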
(ii) We next prove the asymptotic normality. Following the proof of [3], we need to verify that conditions (i)–(iv) in [35] (Section 3.2) also hold for the panel data in rank estimation. The consistency of $\hat{\theta}$ has been established. It remains to verify the following conditions:
Condition 2.
$\frac{1}{\sqrt{n}}\sum_{i=1}^{n}\sum_{t=1}^{T} a(\ddot{\varepsilon}_{it})\, h_{it} \xrightarrow{D} N(0, \Sigma)$;
Condition 3.
$G(\theta)$ is continuous in $\theta$, and $G(\theta^{\star}) = G$;
Condition 4.
$\nu(\theta) = \frac{1}{\sqrt{n}}\sum_{i=1}^{n}\sum_{t=1}^{T}\left\{a(\ddot{\varepsilon}_{it}(\theta))\, h_{it}(\theta) - E\left[a(\ddot{\varepsilon}_{it}(\theta))\, h_{it}(\theta)\right]\right\}$ is stochastically equicontinuous.
By the central limit theorem, Condition 2 follows. From the expression $G(\theta) = \sum_{t=1}^{T} E[h_{it}(\theta)\, h_{it}(\theta)^{T}]$, we find that the elements of $G(\theta)$ are quadratic in $\beta$. Hence, $G(\theta)$ is continuous in $\beta$. Moreover, $\gamma$ enters $G(\theta)$ only through one of the following forms: $E[x_{it}(\gamma)\, x_{is}(\gamma)]$, $E[x_{it}(\gamma)\, z_{is}^{T}]$, $E[1_{it}(\gamma)\, 1_{is}(\gamma)]$, $E[w_{it}\, 1_{is}(\gamma)]$, and $E[q_{it}^{T} w_{it}\, 1_{is}(\gamma)]$. By Condition (A1), there exists a constant $C$ satisfying $\left(E\|w_{it}\|^{2+r/2}\right)^{4/(4+r)} \le C < \infty$. Thus, by Lemma A.1 of [12] and Hölder’s inequality, we have
$$E\left\|w_{it}\, 1(\gamma_{1,it} \le x_{it} \le \gamma_{2,it})\right\|^{2} \le \left(E\|w_{it}\|^{2+r/2}\right)^{4/(4+r)}\left(E\left|1(\gamma_{1,it} \le x_{it} \le \gamma_{2,it})\right|\right)^{1/\tau} \le \left(C\bar{f}_q\,|\gamma_{02} - \gamma_{01}| + C\bar{f}_q\,\|\gamma_{12} - \gamma_{11}\|\right)^{1/\tau},$$
where $\tau = \frac{4+r}{4}$. Therefore, $E[w_{it}\, 1_{is}(\gamma)]$ is continuous in $\gamma$. Likewise, we can show that $E[x_{it}(\gamma)\, x_{is}(\gamma)]$, $E[x_{it}(\gamma)\, z_{is}^{T}]$, $E[1_{it}(\gamma)\, 1_{is}(\gamma)]$, and $E[q_{it}^{T} w_{it}\, 1_{is}(\gamma)]$ are also continuous in $\gamma$. Thus, $G(\theta)$ is continuous in $\theta$. Evaluating at the true value $\theta = \theta^{\star}$, we obtain $G(\theta^{\star}) = G$, so Condition 3 holds.
We next establish Condition 4. Denote $m_{it}(\theta) = (m_{it1}(\theta)^{T}, m_{it2}(\theta)^{T}, m_{it3}(\theta)^{T})^{T}$, in which $m_{it}(\theta) = a(\ddot{\varepsilon}_{it}(\theta))\, h_{it}$, $m_{it1}(\theta)^{T} = \ddot{x}_{it}(\gamma)\, a(\ddot{\varepsilon}_{it}(\theta))$, $m_{it2}(\theta)^{T} = \beta_1 \ddot{1}_{it}^{+}(\gamma)\, a(\ddot{\varepsilon}_{it}(\theta))$, and $m_{it3}(\theta)^{T} = \beta_1 \ddot{q}_{it}^{+}(\gamma)\, a(\ddot{\varepsilon}_{it}(\theta))$.
Note that the first part is linear in $\beta$, and the second and third terms are quadratic in $\beta$. It therefore suffices to establish stochastic equicontinuity with respect to $\gamma$, and we simplify notation by writing $m_{it}(\theta) = m_{it}^{*}(\gamma)$. Under Condition (A1), $m_{it}^{*}(\gamma)$ has a bounded $(2 + r/2)$-th moment and the envelope condition holds. For any $\delta$, set $N_{\delta} = \delta^{-2\tau}$ and let $\gamma_{\cdot,k} = [\gamma_{0,k}, \gamma_{1,k}^{T}]^{T}$, $k = 1, \ldots, N_{\delta}$, be an equally spaced grid on $\Gamma$. Notice that the distance between the grid points is $O(1/N_{\delta})$. Define $m_{itk}^{*} = \min[m_{it}^{*}(\gamma_{\cdot,k-1}), m_{it}^{*}(\gamma_{\cdot,k})]$ and $m_{itk}^{**} = \max[m_{it}^{*}(\gamma_{\cdot,k-1}), m_{it}^{*}(\gamma_{\cdot,k})]$. Then for each $\gamma$, there exists $\gamma_{\cdot,k} = [\gamma_{0,k}, \gamma_{1,k}^{T}]^{T}$ such that $m_{itk}^{*} \le m_{it}^{*}(\gamma) \le m_{itk}^{**}$. Thus $[m_{itk}^{*}, m_{itk}^{**}]$ brackets $m_{it}^{*}(\gamma)$. Using the bound on $E\|w_{it}\, 1(\gamma_{it,k-1} \le x_{it} \le \gamma_{it,k})\|^{2}$, we thus have
$$E\left\|m_{itk}^{**} - m_{itk}^{*}\right\|^{2} = E\left\|m_{it}^{*}(\gamma_{\cdot,k}) - m_{it}^{*}(\gamma_{\cdot,k-1})\right\|^{2} \le \left(C\bar{f}_q\,|\gamma_{02} - \gamma_{01}| + C\bar{f}_q\,\|\gamma_{12} - \gamma_{11}\|\right)^{1/\tau} \le O(N_{\delta}^{-1/\tau}) = O(\delta^{2}).$$
It follows that $N_{\delta} = \delta^{-2\tau}$ is the $L_2$ bracketing number and $\ln N_{\delta} = O(|\ln \delta|)$ is the metric entropy with bracketing for the class $\{m_{it}^{*}(\gamma) : \gamma \in \Gamma\}$. Hence, Condition 4 holds by (2.17) of [36]. Combining these facts, the asymptotic normality in Theorem 1 holds. □

Appendix A.2. Proofs for Section 2.5 and Section 2.6

Theorem 2 follows directly from the asymptotic normality established in Theorem 1, so its proof is omitted.
Recall that we consider the local alternative model $y_{it} = \beta_0 x_{it} + n^{-1/2}\beta_1 (x_{it} - \gamma_{it})_{+} + \beta_2^{T} z_{it} + \mu_i + \varepsilon_{it}$, where $\beta_1 = 0$ corresponds to the null hypothesis. By concentrating out $\mu_i$, we have
$$\ddot{y}_{it} = \beta_0\ddot{x}_{it} + n^{-1/2}\beta_1\ddot{x}_{it}(\gamma^{\star}) + \beta_2^{T}\ddot{z}_{it} + \ddot{\varepsilon}_{it}, \tag{A1}$$
where $\gamma^{\star}$ denotes the true threshold parameter.
To prove Theorem 3, we need the following convergence results.
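Lemma A1.
Under Conditions (A1)–(A3), as $n \to \infty$, we have
(i) $\hat{S}_{1n} \xrightarrow{p} S_1$;
(ii) $\sup_{\gamma \in \Gamma} |S_{2n}(\gamma) - S_2(\gamma)| \xrightarrow{p} 0$;
(iii) $\sup_{\gamma \in \Gamma} |\hat{S}_{2n}(\gamma) - S_2(\gamma)| \xrightarrow{p} 0$;
(iv) $\sup_{\gamma \in \Gamma} |S_n(\gamma, \gamma^{\star}) - S(\gamma, \gamma^{\star})| \xrightarrow{p} 0$.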
Proof. 
Part (i) follows from the weak law of large numbers. For (ii), we can show that $S_{2n}(\gamma) \xrightarrow{p} E\,S_{2n}(\gamma) = S_2(\gamma)$ for each given $\gamma$. The uniform convergence then follows from arguments similar to those used in Lemma 5 of [19]. For (iii), it is sufficient to show that $\sup_{\gamma \in \Gamma} |\hat{S}_{2n}(\gamma) - S_{2n}(\gamma)| = o_p(1)$. We can write
$$\hat{S}_{2n}(\gamma) - S_2(\gamma) = \frac{1}{n}\sum_{i=1}^{n}\sum_{t=1}^{T}\sqrt{12}\left[\hat{f}(\hat{\ddot{\varepsilon}}_{it}) - f(\hat{\ddot{\varepsilon}}_{it})\right]\ddot{w}_{it}\ddot{x}_{it}(\gamma) + \frac{1}{n}\sum_{i=1}^{n}\sum_{t=1}^{T}\sqrt{12}\left[f(\hat{\ddot{\varepsilon}}_{it}) - f(\ddot{\varepsilon}_{it})\right]\ddot{w}_{it}\ddot{x}_{it}(\gamma) + S_{2n}(\gamma) - S_2(\gamma) \equiv I_1 + I_2 + I_3.$$
By the uniform convergence of the kernel density estimator, we have $\sup_{\gamma} |I_1| = o_p(1)$. For $I_2$,
$$|I_2| \le \frac{1}{n}\sum_{i=1}^{n}\sum_{t=1}^{T}\sqrt{12}\,\left\|\ddot{w}_{it}\ddot{x}_{it}(\gamma)\right\| \cdot \max_{i,t}\left|f(\ddot{y}_{it} - \hat{\xi}^{T}\ddot{w}_{it}) - f(\ddot{y}_{it} - \xi^{\star T}\ddot{w}_{it})\right|.$$
By Condition (A3), the rate $\hat{\xi} - \xi^{\star} = O_p(n^{-1/2})$ obtained in the proof of Theorem 3, and the mean-value theorem, it is easy to show
$$\max_{i,t}\left|f(\ddot{y}_{it} - \hat{\xi}^{T}\ddot{w}_{it}) - f(\ddot{y}_{it} - \xi^{\star T}\ddot{w}_{it})\right| \le \max_{i,t}\|\ddot{w}_{it}\| \cdot \|\hat{\xi} - \xi^{\star}\| \cdot |f'(\zeta^{T}\ddot{w}_{it})| = o_p(1),$$
where $\xi^{\star}$ is the true value of $\xi$ and $\zeta$ lies in the segment between $\hat{\xi}$ and $\xi^{\star}$. Thus, $\sup_{\gamma} |I_2| = o_p(1)$. Furthermore, $\sup_{\gamma} |I_3| = o_p(1)$ follows from (ii), and hence (iii) holds.
The proof of part (iv) follows a similar argument to that of part (ii) and is therefore omitted. □
Proof of Theorem 3.
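Recall that
$$\hat{\xi} = (\hat{\beta}_0, \hat{\beta}_2^{T})^{T} = \arg\min_{\xi}\sum_{i=1}^{n}\sum_{t=1}^{T}\sqrt{12}\left[\frac{R(\ddot{y}_{it} - \xi^{T}\ddot{w}_{it})}{nT+1} - 0.5\right](\ddot{y}_{it} - \xi^{T}\ddot{w}_{it}),$$
which is equivalent to solving the estimating equation $M_n(\xi) = 0$, where
$$M_n(\xi) = -\frac{d}{d\xi}\sum_{i=1}^{n}\sum_{t=1}^{T}\sqrt{12}\left[\frac{R(\ddot{y}_{it} - \xi^{T}\ddot{w}_{it})}{nT+1} - 0.5\right](\ddot{y}_{it} - \xi^{T}\ddot{w}_{it}) = \sum_{i=1}^{n}\sum_{t=1}^{T}\sqrt{12}\left[\frac{R(\ddot{y}_{it} - \xi^{T}\ddot{w}_{it})}{nT+1} - 0.5\right]\ddot{w}_{it}.$$
Under the local alternative model (A1), evaluating at the true value $\xi^{\star}$, we have
$$M_n(\xi^{\star}) = \sum_{i=1}^{n}\sum_{t=1}^{T}\sqrt{12}\left[\frac{R\left(\ddot{\varepsilon}_{it} + n^{-1/2}\beta_1\ddot{x}_{it}(\gamma^{\star})\right)}{nT+1} - 0.5\right]\ddot{w}_{it} = \sum_{i=1}^{n}\sum_{t=1}^{T}\sqrt{12}\left[\frac{nT}{nT+1}\hat{F}\left(\ddot{\varepsilon}_{it} + n^{-1/2}\beta_1\ddot{x}_{it}(\gamma^{\star})\right) - 0.5\right]\ddot{w}_{it} + o_p(1) = \sum_{i=1}^{n}\sum_{t=1}^{T}\sqrt{12}\left[F(\ddot{\varepsilon}_{it}) - 0.5 + f(\ddot{\varepsilon}_{it})\, n^{-1/2}\beta_1\ddot{x}_{it}(\gamma^{\star})\right]\ddot{w}_{it}, \tag{A2}$$
where $F(\cdot)$ is the cumulative distribution function of $\ddot{\varepsilon}_{it}$ and $\hat{F}(\cdot)$ is the corresponding estimator. The last equality in (A2) follows from a Taylor expansion.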
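Next, Theorem A.3.8 in [18] yields
$$n^{-1/2}M_n(\hat{\xi}) = n^{-1/2}M_n(\xi^{\star}) - c_{\phi}^{-1}\,\frac{1}{n}\sum_{i=1}^{n}\sum_{t=1}^{T}\ddot{w}_{it}\ddot{w}_{it}^{T}\,\sqrt{n}(\hat{\xi} - \xi^{\star}) + o_p(1).$$
Since $n^{-1/2}M_n(\hat{\xi}) = 0$, and combined with Lemma A1, it follows that
$$\sqrt{n}(\hat{\xi} - \xi^{\star}) = c_{\phi}S_1^{-1}n^{-1/2}\sum_{i=1}^{n}\sum_{t=1}^{T}\sqrt{12}\left[F(\ddot{\varepsilon}_{it}) - 0.5\right]\ddot{w}_{it} + c_{\phi}S_1^{-1}n^{-1/2}\sum_{i=1}^{n}\sum_{t=1}^{T}\sqrt{12}\,f(\ddot{\varepsilon}_{it})\, n^{-1/2}\beta_1\ddot{x}_{it}(\gamma^{\star})\,\ddot{w}_{it} + o_p(1),$$
where $\xi^{\star} = (\beta_0, \beta_2^{T})^{T}$ is the true value of $\xi$.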
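Under the local alternative model (A1), we can rewrite $R_n(\gamma)$ as
$$R_n(\gamma) = \frac{1}{\sqrt{n}}\sum_{i=1}^{n}\sum_{t=1}^{T}\sqrt{12}\left[\frac{R(\ddot{y}_{it} - \hat{\xi}^{T}\ddot{w}_{it})}{nT+1} - 0.5\right]\left[\ddot{x}_{it}(\gamma) - c_{\phi}\hat{S}_{2n}(\gamma)^{T}\hat{S}_{1n}^{-1}\ddot{w}_{it}\right] = \frac{1}{\sqrt{n}}\sum_{i=1}^{n}\sum_{t=1}^{T}\sqrt{12}\left[F(\ddot{\varepsilon}_{it}) - 0.5 - f(\ddot{\varepsilon}_{it})(\hat{\xi} - \xi^{\star})^{T}\ddot{w}_{it} + n^{-1/2}f(\ddot{\varepsilon}_{it})\beta_1\ddot{x}_{it}(\gamma^{\star})\right]\ddot{x}_{it}(\gamma) - \frac{1}{\sqrt{n}}\sum_{i=1}^{n}\sum_{t=1}^{T}\sqrt{12}\left[F(\ddot{\varepsilon}_{it}) - 0.5 - f(\ddot{\varepsilon}_{it})(\hat{\xi} - \xi^{\star})^{T}\ddot{w}_{it} + n^{-1/2}f(\ddot{\varepsilon}_{it})\beta_1\ddot{x}_{it}(\gamma^{\star})\right]c_{\phi}\hat{S}_{2n}(\gamma)^{T}\hat{S}_{1n}^{-1}\ddot{w}_{it} \equiv A_1 - A_2.$$
For $A_1$, note that
$$A_1 = \frac{1}{\sqrt{n}}\sum_{i=1}^{n}\sum_{t=1}^{T}\sqrt{12}\left[F(\ddot{\varepsilon}_{it}) - 0.5\right]\ddot{x}_{it}(\gamma) - (\hat{\xi} - \xi^{\star})^{T}\frac{1}{\sqrt{n}}\sum_{i=1}^{n}\sum_{t=1}^{T}\sqrt{12}\,f(\ddot{\varepsilon}_{it})\,\ddot{w}_{it}\ddot{x}_{it}(\gamma) + \frac{1}{\sqrt{n}}\sum_{i=1}^{n}\sum_{t=1}^{T}\sqrt{12}\,n^{-1/2}f(\ddot{\varepsilon}_{it})\beta_1\ddot{x}_{it}(\gamma^{\star})\ddot{x}_{it}(\gamma) = \frac{1}{\sqrt{n}}\sum_{i=1}^{n}\sum_{t=1}^{T}\sqrt{12}\left[F(\ddot{\varepsilon}_{it}) - 0.5\right]\ddot{x}_{it}(\gamma) - \sqrt{n}(\hat{\xi} - \xi^{\star})^{T}S_2(\gamma) + \beta_1 S(\gamma, \gamma^{\star}) + o_p(1).$$
For $A_2$, by applying Lemma A1, we get
$$A_2 = c_{\phi}\hat{S}_{2n}(\gamma)^{T}\hat{S}_{1n}^{-1}\frac{1}{\sqrt{n}}\sum_{i=1}^{n}\sum_{t=1}^{T}\sqrt{12}\left[F(\ddot{\varepsilon}_{it}) - 0.5\right]\ddot{w}_{it} - c_{\phi}\hat{S}_{2n}(\gamma)^{T}\hat{S}_{1n}^{-1}\frac{1}{\sqrt{n}}\sum_{i=1}^{n}\sum_{t=1}^{T}\sqrt{12}\,f(\ddot{\varepsilon}_{it})(\hat{\xi} - \xi^{\star})^{T}\ddot{w}_{it}\,\ddot{w}_{it} + c_{\phi}\hat{S}_{2n}(\gamma)^{T}\hat{S}_{1n}^{-1}\frac{1}{\sqrt{n}}\sum_{i=1}^{n}\sum_{t=1}^{T}\sqrt{12}\,n^{-1/2}f(\ddot{\varepsilon}_{it})\beta_1\ddot{x}_{it}(\gamma^{\star})\,\ddot{w}_{it} = c_{\phi}S_2(\gamma)^{T}S_1^{-1}\frac{1}{\sqrt{n}}\sum_{i=1}^{n}\sum_{t=1}^{T}\sqrt{12}\left[F(\ddot{\varepsilon}_{it}) - 0.5\right]\ddot{w}_{it} - \sqrt{n}(\hat{\xi} - \xi^{\star})^{T}S_2(\gamma) + \beta_1 c_{\phi}S_2(\gamma)^{T}S_1^{-1}S_2(\gamma^{\star}) + o_p(1).$$
Combining (A3) and (A4) together, we have
$$R_n(\gamma) = \frac{1}{\sqrt{n}}\sum_{i=1}^{n}\sum_{t=1}^{T}\sqrt{12}\left[F(\ddot{\varepsilon}_{it}) - 0.5\right]\left[\ddot{x}_{it}(\gamma) - c_{\phi}S_2(\gamma)^{T}S_1^{-1}\ddot{w}_{it}\right] + \kappa(\gamma) + o_p(1) \xrightarrow{D} R(\gamma) + \beta_1\left\{S(\gamma, \gamma^{\star}) - c_{\phi}S_2(\gamma)^{T}S_1^{-1}S_2(\gamma^{\star})\right\},$$
where the weak convergence of $R_n(\gamma)$ can be obtained by following the proofs in [37]. □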
Proof of Corollary 1.
By arguments analogous to those used in the proof of Theorem 3, the desired result follows immediately. □
Proof of Theorem 4.
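We divide the proof into three steps. First, we show that the covariance function of $R_n^{*}(\gamma)$ converges to that of $R(\gamma)$. Define
$$R_n^{**}(\gamma) = \frac{1}{\sqrt{n}}\sum_{i=1}^{n}u_i\sum_{t=1}^{T}\sqrt{12}\left[F(\ddot{\varepsilon}_{it}) - 0.5\right]\left[\ddot{x}_{it}(\gamma) - c_{\phi}S_2(\gamma)^{T}S_1^{-1}\ddot{w}_{it}\right].$$
By leveraging the uniform convergence of $\hat{F} - F$ and $\hat{c}_{\phi} - c_{\phi}$, along with the uniform convergence of $\hat{S}_{2n}(\gamma) - S_2(\gamma)$ established in Lemma A1, we can readily show that $R_n^{*}(\gamma)$ and $R_n^{**}(\gamma)$ are asymptotically equivalent in the sense that $\sup_{\gamma}\left|R_n^{*}(\gamma) - R_n^{**}(\gamma)\right| = o_p(1)$. Note that the $u_i$’s are independent of $(y_{it}, x_{it}, z_{it}, q_{it})$, with $E u_i = 0$ and $\mathrm{Var}(u_i) = 1$. Then, for any $\gamma$ and $\tilde{\gamma}$, the covariance function of $R_n^{**}(\gamma)$ is
$$\mathrm{Cov}\left(R_n^{**}(\gamma), R_n^{**}(\tilde{\gamma})\right) = \frac{1}{n}\sum_{i=1}^{n}E\left(u_i^{2}\sum_{t=1}^{T}\left\{\sqrt{12}\left[F(\ddot{\varepsilon}_{it}) - 0.5\right]\right\}^{2}\left[\ddot{x}_{it}(\gamma) - c_{\phi}S_2(\gamma)^{T}S_1^{-1}\ddot{w}_{it}\right] \times \left[\ddot{x}_{it}(\tilde{\gamma}) - c_{\phi}S_2(\tilde{\gamma})^{T}S_1^{-1}\ddot{w}_{it}\right]\right) = \sum_{t=1}^{T}E\left\{\left[\ddot{x}_{it}(\gamma) - c_{\phi}S_2(\gamma)^{T}S_1^{-1}\ddot{w}_{it}\right] \cdot \left[\ddot{x}_{it}(\tilde{\gamma}) - c_{\phi}S_2(\tilde{\gamma})^{T}S_1^{-1}\ddot{w}_{it}\right]\right\},$$
which is the same as the covariance of $R(\gamma)$.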
Second, it is straightforward to show that any finite-dimensional projection of R n * ( γ ) converges to that of R ( γ ) , by the central limit theorem.
Third, $R_n^{*}(\gamma)$ is uniformly tight. Note that the class of all indicator functions $I(x_{it} \le \gamma_{it})$ constitutes a Vapnik-Chervonenkis (VC) class of functions. Consequently, the class of functions
$$\mathcal{F}_n = \left\{\sum_{t=1}^{T}\left[\ddot{x}_{it}(\gamma) - c_{\phi}S_2(\gamma)^{T}S_1^{-1}\ddot{w}_{it}\right] : \gamma \in \Gamma\right\}$$
is also a VC class. Thus, by appealing to the equicontinuity lemma (Lemma 15) from [38], one can establish that $R_n^{*}(\gamma)$ is uniformly tight. Finally, by the Cramér-Wold device, the proof of Theorem 4 is completed. □
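The multiplier construction analysed in this proof can be illustrated numerically. The sketch below assumes that each individual's summed score contribution has already been evaluated on a grid of threshold values and stored in an array `score_contrib`. It uses a plain, unstandardized sup-norm statistic and standard normal multipliers, which are assumptions of this illustration and may differ from the exact form of $L_n$ and of Algorithm 1.

```python
import numpy as np

def multiplier_bootstrap_pvalue(score_contrib, n_boot=1000, seed=0):
    """Bootstrap p-value for a sup-type score statistic.

    score_contrib: (n, K) array; row i holds individual i's summed score
    contribution at each of K grid values of gamma (hypothetical input).
    """
    s = np.asarray(score_contrib, dtype=float)
    n = s.shape[0]
    rng = np.random.default_rng(seed)
    observed = np.max(np.abs(s.sum(axis=0))) / np.sqrt(n)
    u = rng.standard_normal((n_boot, n))        # E u_i = 0, Var(u_i) = 1
    boot = np.max(np.abs(u @ s), axis=1) / np.sqrt(n)
    return float(observed), float(np.mean(boot >= observed))

# Toy input: 200 individuals, 3 grid points, a strong common signal.
stat, p_value = multiplier_bootstrap_pvalue(np.ones((200, 3)))
```

Multiplying each individual's whole contribution by a single $u_i$ keeps the within-individual dependence across $t$ intact, which is why the bootstrap process reproduces the covariance function derived above.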

References

  1. Tong, H. Non-Linear Time Series: A Dynamical System Approach; Oxford University Press: Oxford, UK, 1990.
  2. Hansen, B.E. Sample Splitting and Threshold Estimation. Econometrica 2000, 68, 575–603.
  3. Hansen, B.E. Regression kink with an unknown threshold. J. Bus. Econ. Stat. 2017, 35, 228–240.
  4. Card, D.; Mas, A.; Rothstein, J. Tipping and the Dynamics of Segregation. Q. J. Econ. 2008, 123, 177–218.
  5. Zhong, W.; Wan, C.; Zhang, W. Estimation and inference for multi-kink quantile regression. J. Bus. Econ. Stat. 2022, 40, 1123–1139.
  6. Zhang, F.; Xie, R.; Xiao, Z. Time series quantile regression kink with an unknown threshold. Econom. Rev. 2025, 44, 1275–1320.
  7. Das, R.; Banerjee, M.; Nan, B.; Zheng, H. Fast estimation of regression parameters in a broken-stick model for longitudinal data. J. Am. Stat. Assoc. 2016, 111, 1132–1143.
  8. Wan, C.; Zhong, W.; Zhang, W.; Zou, C. Multikink quantile regression for longitudinal data with application to progesterone data analysis. Biometrics 2023, 79, 747–760.
  9. Zhang, Y.; Zhou, Q.; Jiang, L. Panel kink regression with an unknown threshold. Econ. Lett. 2017, 157, 116–121.
  10. Sun, Y.; Wan, C.; Zhang, W.; Zhong, W. A Multi-Kink quantile regression model with common structure for panel data analysis. J. Econom. 2024, 239, 105304.
  11. Yang, L.; Su, J.J. Debt and growth: Is there a constant tipping point? J. Int. Money Financ. 2018, 87, 133–143.
  12. Yang, L.; Zhang, C.; Lee, C.; Chen, I.P. Panel kink threshold regression model with a covariate-dependent threshold. Econom. J. 2021, 24, 462–481.
  13. Zhang, F.; Li, Q. Robust bent line regression. J. Stat. Plan. Inference 2017, 185, 41–55.
  14. Jaeckel, L.A. Estimating Regression Coefficients by Minimizing the Dispersion of the Residuals. Ann. Math. Stat. 1972, 43, 1449–1458.
  15. Hansen, B.E. Threshold effects in non-dynamic panels: Estimation, testing, and inference. J. Econom. 1999, 93, 345–368.
  16. Zhou, M.; Ye, F.; Li, Y.; Liu, F.; Wan, C. A note on the covariate-dependent kink threshold regression model for panel data. Commun. Stat. Theory Methods 2025, 54, 908–920.
  17. Jureckova, J. Nonparametric Estimate of Regression Coefficients. Ann. Math. Stat. 1971, 42, 1328–1338.
  18. Hettmansperger, T.; McKean, J. Robust Nonparametric Statistical Methods, 2nd ed.; Robust Nonparametric Statistical Methods; CRC Press: Boca Raton, FL, USA, 2011. [Google Scholar]
  19. Yu, P.; Fan, X. Threshold regression with a threshold boundary. J. Bus. Econ. Stat. 2021, 39, 953–971. [Google Scholar] [CrossRef]
  20. Zhang, Y.; Wang, H.J.; Zhu, Z. Single-index thresholding in quantile regression. J. Am. Stat. Assoc. 2022, 117, 2222–2237. [Google Scholar] [CrossRef]
  21. Wei, K.; Zhu, H.; Qin, G.; Zhu, Z.; Tu, D. Multiply robust subgroup analysis based on a single-index threshold linear marginal model for longitudinal data with dropouts. Stat. Med. 2022, 41, 2822–2839. [Google Scholar] [CrossRef]
  22. Wan, C.; Zeng, H.; Zhang, W.; Zhong, W.; Zou, C. Data-driven estimation for multithreshold accelerated failure time model. Scand. J. Stat. 2025, 52, 447–468. [Google Scholar] [CrossRef]
  23. Koul, H.L.; Sievers, G.L.; McKean, J. An estimator of the scale parameter for the rank analysis of linear models under general score functions. Scand. J. Stat. 1987, 14, 131–141. [Google Scholar]
  24. Lee, S.; Seo, M.H.; Shin, Y. Testing for threshold effects in regression models. J. Am. Stat. Assoc. 2011, 106, 220–231. [Google Scholar] [CrossRef]
  25. Roy, S.N. On a heuristic method of test construction and its use in multivariate analysis. Ann. Math. Stat. 1953, 24, 220–238. [Google Scholar] [CrossRef]
  26. Silverman, B.W. Density Estimation for Statistics and Data Analysis; Routledge: London, UK, 1986. [Google Scholar]
  27. Mincer, J. Labor force participation of married women: A study of labor supply. In Aspects of Labor Economics; Princeton University Press: Princeton, NJ, USA, 1962; pp. 63–105. [Google Scholar]
  28. Heckman, J. Shadow prices, market wages, and labor supply. Econom. J. Econom. Soc. 1974, 42, 679–694. [Google Scholar] [CrossRef]
  29. Blau, F.D.; Kahn, L.M. Changes in the labor supply behavior of married women: 1980–2000. J. Labor Econ. 2007, 25, 393–438. [Google Scholar] [CrossRef]
  30. Bick, A.; Blandin, A.; Rogerson, R. Hours and wages. Q. J. Econ. 2022, 137, 1901–1962. [Google Scholar] [CrossRef]
  31. Gicheva, D. Working long hours and early career outcomes in the high-end labor market. J. Labor Econ. 2013, 31, 785–824. [Google Scholar] [CrossRef]
  32. Liu, K. Explaining the gender wage gap: Estimates from a dynamic model of job changes and hours changes. Quant. Econ. 2016, 7, 411–447. [Google Scholar] [CrossRef][Green Version]
  33. Hill, R.C.; Griffiths, W.E.; Lim, G.C. Principles of Econometrics; John Wiley & Sons: Hoboken, NJ, USA, 2018. [Google Scholar]
  34. Newey, W.K.; McFadden, D. Large sample estimation and hypothesis testing. Handb. Econom. 1994, 4, 2111–2245. [Google Scholar]
  35. Andrews, D.W. Empirical process methods in econometrics. Handb. Econom. 1994, 4, 2247–2294. [Google Scholar]
  36. Doukhan, P.; Massart, P.; Rio, E. Invariance principles for absolutely regular empirical processes. In Proceedings of the Annales de l’IHP Probabilités et Statistiques; Institute of Mathematical Statistics: Waite Hill, OH, USA, 1995; Volume 31, pp. 393–427. [Google Scholar]
  37. Stute, W. Nonparametric model checks for regression. Ann. Stat. 1997, 25, 613–641. [Google Scholar] [CrossRef]
  38. Pollard, D. Convergence of Stochastic Processes; Springer Science & Business Media: Berlin/Heidelberg, Germany, 2012. [Google Scholar]
Figure 1. The Q–Q plot against the normal distribution. The red 45-degree line in the Q–Q plot is the ideal reference line, which is used to judge whether the sample data follows the theoretical normal distribution.
Figure 2. Diagnostic plots for influential observations. The vertical solid lines represent the Cook’s distance for each observation, measuring the influence of each data point on the regression model. The horizontal dashed red line indicates the commonly used threshold 4 / n for identifying influential observations.
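A diagnostic of the kind shown in Figure 2 can be sketched as follows. This is a minimal illustration, assuming an ordinary least-squares fit on synthetic data; the function name `cooks_distance`, the simulated design, and the planted outlier are hypothetical and are not taken from the paper's wage data or its panel model.

```python
import numpy as np

def cooks_distance(X, y):
    """Cook's distance for each observation of an OLS fit of y on X.

    X is assumed to have full column rank and to include an intercept column.
    """
    n, p = X.shape
    # Leverages h_ii are the squared row norms of Q in the reduced QR of X.
    Q, _ = np.linalg.qr(X)
    h = np.sum(Q ** 2, axis=1)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    s2 = resid @ resid / (n - p)  # residual variance estimate
    return resid ** 2 / (p * s2) * h / (1.0 - h) ** 2

# Synthetic example (assumption): one gross outlier planted at index 0.
rng = np.random.default_rng(0)
n = 200
x = rng.normal(size=n)
y = 1.0 + 0.5 * x + rng.normal(size=n)
y[0] += 25.0
X = np.column_stack([np.ones(n), x])

d = cooks_distance(X, y)
flagged = np.flatnonzero(d > 4 / n)  # the 4/n rule of thumb from the caption
print("flagged observations:", flagged)
```

Observations whose distance exceeds 4/n are candidates for the kind of trimming that a sensitivity analysis would examine; the rule is a screening heuristic rather than a formal test.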
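Table 1. Performance comparison between the proposed estimator and Yang’s estimator, based on 500 simulated samples generated from DGP 1 with n = 50, 100 and T = 10, for the four error distributions.

n = 50, T = 10

| Errors | Metric | Yang β₀ | Yang β₁ | Yang β₂ | Yang γ₀ | Yang γ₁ | Proposed β₀ | Proposed β₁ | Proposed β₂ | Proposed γ₀ | Proposed γ₁ |
|---|---|---|---|---|---|---|---|---|---|---|---|
| N(0, 1) | Bias | 0.015 | −0.028 | −0.012 | 0.000 | 0.005 | 0.014 | −0.028 | −0.011 | 0.001 | 0.004 |
| | SD | 0.130 | 0.134 | 0.087 | 0.256 | 0.080 | 0.133 | 0.137 | 0.090 | 0.260 | 0.082 |
| | ESE | 0.102 | 0.128 | 0.071 | 0.154 | 0.061 | 0.140 | 0.177 | 0.097 | 0.213 | 0.085 |
| | MSE | 0.017 ¹ | 0.019 | 0.008 | 0.065 | 0.006 | 0.018 | 0.020 | 0.008 | 0.067 | 0.007 |
| | ECP | 0.874 | 0.924 | 0.884 | 0.806 | 0.908 | 0.956 | 0.988 | 0.962 | 0.902 | 0.940 |
| t(3) | Bias | 0.015 | −0.035 | 0.002 | 0.007 | −0.010 | 0.006 | −0.014 | −0.003 | 0.002 | 0.001 |
| | SD | 0.157 | 0.262 | 0.093 | 0.265 | 0.089 | 0.095 | 0.103 | 0.070 | 0.173 | 0.066 |
| | ESE | 0.098 | 0.132 | 0.066 | 0.152 | 0.060 | 0.106 | 0.136 | 0.073 | 0.170 | 0.067 |
| | MSE | 0.025 | 0.070 | 0.009 | 0.070 | 0.008 | 0.009 | 0.011 | 0.005 | 0.030 | 0.004 |
| | ECP | 0.898 | 0.930 | 0.868 | 0.822 | 0.924 | 0.954 | 0.980 | 0.948 | 0.926 | 0.976 |
| T(0.1, 10) | Bias | 0.028 | −0.027 | −0.015 | −0.034 | 0.001 | 0.003 | 0.000 | −0.006 | −0.009 | 0.006 |
| | SD | 0.127 | 0.133 | 0.090 | 0.237 | 0.083 | 0.055 | 0.060 | 0.048 | 0.098 | 0.059 |
| | ESE | 0.101 | 0.126 | 0.070 | 0.153 | 0.061 | 0.074 | 0.094 | 0.052 | 0.117 | 0.047 |
| | MSE | 0.017 | 0.018 | 0.008 | 0.057 | 0.007 | 0.003 | 0.004 | 0.002 | 0.010 | 0.004 |
| | ECP | 0.910 | 0.952 | 0.898 | 0.838 | 0.910 | 0.976 | 0.988 | 0.948 | 0.968 | 0.922 |
| LN(0, 1) | Bias | 0.013 | −0.022 | −0.016 | −0.013 | 0.001 | −0.002 | 0.002 | −0.003 | −0.006 | 0.008 |
| | SD | 0.176 | 0.195 | 0.125 | 0.292 | 0.111 | 0.061 | 0.070 | 0.050 | 0.112 | 0.059 |
| | ESE | 0.102 | 0.129 | 0.072 | 0.160 | 0.061 | 0.080 | 0.102 | 0.056 | 0.130 | 0.051 |
| | MSE | 0.031 | 0.039 | 0.016 | 0.085 | 0.012 | 0.004 | 0.005 | 0.002 | 0.013 | 0.004 |
| | ECP | 0.892 | 0.946 | 0.864 | 0.834 | 0.904 | 0.972 | 1.000 | 0.932 | 0.960 | 0.938 |

n = 100, T = 10

| Errors | Metric | Yang β₀ | Yang β₁ | Yang β₂ | Yang γ₀ | Yang γ₁ | Proposed β₀ | Proposed β₁ | Proposed β₂ | Proposed γ₀ | Proposed γ₁ |
|---|---|---|---|---|---|---|---|---|---|---|---|
| N(0, 1) | Bias | 0.013 | −0.017 | −0.009 | 0.001 | 0.001 | 0.013 | −0.017 | −0.009 | 0.002 | 0.001 |
| | SD | 0.086 | 0.095 | 0.063 | 0.145 | 0.064 | 0.087 | 0.097 | 0.064 | 0.145 | 0.065 |
| | ESE | 0.071 | 0.089 | 0.049 | 0.108 | 0.043 | 0.099 | 0.124 | 0.069 | 0.152 | 0.060 |
| | MSE | 0.008 | 0.009 | 0.004 | 0.021 | 0.004 | 0.008 | 0.010 | 0.004 | 0.021 | 0.004 |
| | ECP | 0.892 | 0.922 | 0.878 | 0.884 | 0.960 | 0.964 | 0.984 | 0.970 | 0.958 | 0.982 |
| t(3) | Bias | 0.011 | −0.011 | −0.008 | −0.006 | 0.001 | 0.001 | 0.002 | −0.002 | −0.005 | 0.003 |
| | SD | 0.186 | 0.183 | 0.130 | 0.170 | 0.066 | 0.066 | 0.075 | 0.051 | 0.117 | 0.060 |
| | ESE | 0.076 | 0.095 | 0.053 | 0.109 | 0.043 | 0.080 | 0.102 | 0.055 | 0.126 | 0.049 |
| | MSE | 0.035 | 0.034 | 0.017 | 0.029 | 0.004 | 0.004 | 0.006 | 0.003 | 0.014 | 0.004 |
| | ECP | 0.904 | 0.914 | 0.880 | 0.860 | 0.900 | 0.964 | 0.982 | 0.952 | 0.962 | 0.984 |
| T(0.1, 10) | Bias | 0.007 | −0.008 | −0.004 | −0.004 | 0.003 | −0.002 | 0.005 | −0.005 | −0.008 | 0.011 |
| | SD | 0.087 | 0.098 | 0.063 | 0.159 | 0.066 | 0.039 | 0.043 | 0.040 | 0.072 | 0.057 |
| | ESE | 0.070 | 0.088 | 0.049 | 0.109 | 0.043 | 0.054 | 0.068 | 0.037 | 0.084 | 0.053 |
| | MSE | 0.008 | 0.010 | 0.004 | 0.025 | 0.004 | 0.001 | 0.002 | 0.002 | 0.005 | 0.003 |
| | ECP | 0.916 | 0.930 | 0.888 | 0.846 | 0.920 | 0.990 | 0.996 | 0.942 | 0.958 | 0.948 |
| LN(0, 1) | Bias | 0.010 | −0.014 | −0.008 | −0.007 | 0.003 | 0.000 | 0.002 | −0.006 | −0.006 | 0.011 |
| | SD | 0.081 | 0.092 | 0.063 | 0.154 | 0.067 | 0.042 | 0.047 | 0.044 | 0.083 | 0.058 |
| | ESE | 0.070 | 0.088 | 0.049 | 0.110 | 0.043 | 0.058 | 0.073 | 0.041 | 0.093 | 0.056 |
| | MSE | 0.007 | 0.009 | 0.004 | 0.024 | 0.005 | 0.002 | 0.002 | 0.002 | 0.007 | 0.003 |
| | ECP | 0.908 | 0.944 | 0.880 | 0.866 | 0.900 | 0.978 | 0.992 | 0.898 | 0.958 | 0.942 |

¹ The smaller MSE values are highlighted in black.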
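The summary measures reported in Table 1 can be computed from Monte Carlo replications roughly as sketched below. The sketch assumes that ESE is the average of the estimated standard errors and that ECP is the empirical coverage of a nominal 95% normal-approximation interval; these readings of the abbreviations, the function name `summarize`, and the simulated inputs are assumptions made for illustration, not definitions quoted from the paper.

```python
import numpy as np

def summarize(estimates, std_errors, truth, z=1.96):
    """Monte Carlo summary of one parameter across replications."""
    estimates = np.asarray(estimates, dtype=float)
    std_errors = np.asarray(std_errors, dtype=float)
    bias = estimates.mean() - truth              # average estimation error
    sd = estimates.std(ddof=1)                   # empirical standard deviation
    ese = std_errors.mean()                      # average estimated standard error
    mse = np.mean((estimates - truth) ** 2)      # mean squared error
    # Share of replications whose interval est +/- z * se contains the truth.
    ecp = np.mean(np.abs(estimates - truth) <= z * std_errors)
    return {"Bias": bias, "SD": sd, "ESE": ese, "MSE": mse, "ECP": ecp}

# Hypothetical input: 500 replications of a well-calibrated estimator.
rng = np.random.default_rng(1)
reps, truth = 500, 1.0
est = truth + 0.1 * rng.normal(size=reps)
se = np.full(reps, 0.1)

out = summarize(est, se, truth)
print({k: round(float(v), 3) for k, v in out.items()})
```

Under these definitions the MSE is approximately Bias² + SD², which agrees with the first column of Table 1 (0.015² + 0.130² ≈ 0.017). An ESE close to the SD together with an ECP close to 0.95 indicates that the standard-error estimate is well calibrated.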
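Table 2. Performance comparison between the proposed estimator and Yang’s estimator, based on 500 simulated samples generated from DGP 1 with n = 50, 100 and T = 20, for the four error distributions.

n = 50, T = 20

| Errors | Metric | Yang β₀ | Yang β₁ | Yang β₂ | Yang γ₀ | Yang γ₁ | Proposed β₀ | Proposed β₁ | Proposed β₂ | Proposed γ₀ | Proposed γ₁ |
|---|---|---|---|---|---|---|---|---|---|---|---|
| N(0, 1) | Bias | 0.005 | −0.002 | 0.000 | −0.007 | −0.004 | 0.004 | −0.002 | 0.001 | −0.003 | −0.004 |
| | SD | 0.078 | 0.088 | 0.060 | 0.144 | 0.065 | 0.080 | 0.091 | 0.062 | 0.147 | 0.066 |
| | ESE | 0.072 | 0.089 | 0.049 | 0.110 | 0.044 | 0.103 | 0.128 | 0.070 | 0.159 | 0.063 |
| | MSE | 0.006 ¹ | 0.008 | 0.004 | 0.021 | 0.004 | 0.006 | 0.008 | 0.004 | 0.022 | 0.004 |
| | ECP | 0.932 | 0.954 | 0.906 | 0.896 | 0.964 | 0.992 | 0.992 | 0.976 | 0.954 | 0.944 |
| t(3) | Bias | 0.001 | −0.005 | −0.002 | 0.004 | 0.003 | −0.003 | 0.003 | −0.001 | 0.001 | 0.004 |
| | SD | 0.087 | 0.097 | 0.068 | 0.153 | 0.066 | 0.062 | 0.069 | 0.051 | 0.109 | 0.059 |
| | ESE | 0.071 | 0.090 | 0.048 | 0.109 | 0.042 | 0.080 | 0.101 | 0.056 | 0.124 | 0.049 |
| | MSE | 0.008 | 0.009 | 0.005 | 0.023 | 0.004 | 0.004 | 0.005 | 0.003 | 0.012 | 0.004 |
| | ECP | 0.918 | 0.936 | 0.870 | 0.870 | 0.916 | 0.980 | 0.990 | 0.960 | 0.966 | 0.920 |
| T(0.1, 10) | Bias | 0.011 | −0.006 | −0.006 | −0.014 | 0.001 | 0.001 | 0.003 | −0.009 | −0.012 | 0.014 |
| | SD | 0.080 | 0.089 | 0.060 | 0.143 | 0.066 | 0.034 | 0.038 | 0.039 | 0.068 | 0.056 |
| | ESE | 0.071 | 0.089 | 0.049 | 0.110 | 0.043 | 0.051 | 0.064 | 0.035 | 0.079 | 0.052 |
| | MSE | 0.007 | 0.008 | 0.004 | 0.021 | 0.004 | 0.001 | 0.001 | 0.002 | 0.005 | 0.003 |
| | ECP | 0.936 | 0.956 | 0.910 | 0.888 | 0.928 | 0.996 | 0.996 | 0.926 | 0.968 | 0.942 |
| LN(0, 1) | Bias | 0.007 | −0.012 | −0.002 | −0.002 | 0.001 | −0.004 | 0.006 | −0.005 | −0.008 | 0.013 |
| | SD | 0.080 | 0.091 | 0.061 | 0.146 | 0.064 | 0.037 | 0.043 | 0.039 | 0.072 | 0.057 |
| | ESE | 0.070 | 0.088 | 0.048 | 0.110 | 0.043 | 0.056 | 0.071 | 0.038 | 0.088 | 0.049 |
| | MSE | 0.006 | 0.008 | 0.004 | 0.021 | 0.004 | 0.001 | 0.002 | 0.002 | 0.005 | 0.003 |
| | ECP | 0.946 | 0.962 | 0.886 | 0.908 | 0.908 | 0.994 | 0.996 | 0.906 | 0.974 | 0.912 |

n = 100, T = 20

| Errors | Metric | Yang β₀ | Yang β₁ | Yang β₂ | Yang γ₀ | Yang γ₁ | Proposed β₀ | Proposed β₁ | Proposed β₂ | Proposed γ₀ | Proposed γ₁ |
|---|---|---|---|---|---|---|---|---|---|---|---|
| N(0, 1) | Bias | 0.003 | −0.004 | −0.001 | −0.002 | 0.000 | 0.004 | −0.004 | −0.002 | −0.006 | 0.000 |
| | SD | 0.059 | 0.066 | 0.049 | 0.098 | 0.060 | 0.059 | 0.065 | 0.050 | 0.102 | 0.060 |
| | ESE | 0.050 | 0.062 | 0.034 | 0.077 | 0.030 | 0.072 | 0.090 | 0.050 | 0.111 | 0.044 |
| | MSE | 0.003 | 0.004 | 0.002 | 0.010 | 0.004 | 0.004 | 0.004 | 0.002 | 0.010 | 0.004 |
| | ECP | 0.904 | 0.928 | 0.810 | 0.834 | 0.576 | 0.986 | 0.990 | 0.962 | 0.988 | 0.996 |
| t(3) | Bias | −0.002 | −0.003 | −0.003 | 0.002 | 0.006 | −0.003 | −0.001 | −0.005 | 0.000 | 0.011 |
| | SD | 0.058 | 0.066 | 0.050 | 0.103 | 0.059 | 0.046 | 0.052 | 0.045 | 0.084 | 0.058 |
| | ESE | 0.049 | 0.061 | 0.034 | 0.077 | 0.030 | 0.056 | 0.071 | 0.039 | 0.088 | 0.045 |
| | MSE | 0.003 | 0.004 | 0.002 | 0.011 | 0.004 | 0.002 | 0.003 | 0.002 | 0.007 | 0.003 |
| | ECP | 0.896 | 0.942 | 0.828 | 0.816 | 0.584 | 0.982 | 0.994 | 0.916 | 0.922 | 0.870 |
| T(0.1, 10) | Bias | 0.003 | 0.000 | −0.003 | −0.009 | 0.004 | −0.002 | 0.005 | −0.013 | −0.015 | 0.024 |
| | SD | 0.059 | 0.065 | 0.049 | 0.105 | 0.060 | 0.027 | 0.028 | 0.035 | 0.060 | 0.051 |
| | ESE | 0.050 | 0.062 | 0.035 | 0.077 | 0.031 | 0.036 | 0.045 | 0.025 | 0.056 | 0.042 |
| | MSE | 0.003 | 0.004 | 0.002 | 0.011 | 0.004 | 0.001 | 0.001 | 0.001 | 0.004 | 0.003 |
| | ECP | 0.918 | 0.930 | 0.824 | 0.798 | 0.550 | 0.984 | 0.996 | 0.884 | 0.996 | 0.866 |
| LN(0, 1) | Bias | 0.001 | 0.000 | −0.005 | −0.003 | 0.004 | −0.002 | 0.005 | −0.012 | −0.016 | 0.020 |
| | SD | 0.056 | 0.065 | 0.049 | 0.100 | 0.060 | 0.029 | 0.031 | 0.037 | 0.063 | 0.053 |
| | ESE | 0.049 | 0.062 | 0.034 | 0.078 | 0.030 | 0.040 | 0.050 | 0.028 | 0.063 | 0.044 |
| | MSE | 0.003 | 0.004 | 0.002 | 0.010 | 0.004 | 0.001 | 0.001 | 0.001 | 0.004 | 0.003 |
| | ECP | 0.930 | 0.960 | 0.830 | 0.844 | 0.552 | 0.986 | 0.996 | 0.824 | 0.986 | 0.912 |

¹ The smaller MSE values are highlighted in black.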
Table 3. Empirical sizes (DGP 2) and powers (DGP 1) for the test of the threshold constancy.

| n | T | Methods | N(0, 1) Size | N(0, 1) Power | t(3) Size | t(3) Power | T(0.1, 10) Size | T(0.1, 10) Power | LN(0, 1) Size | LN(0, 1) Power |
|---|---|---|---|---|---|---|---|---|---|---|
| 50 | 10 | Yang | 0.096 | 1.000 | 0.132 | 0.996 | 0.142 | 0.996 | 0.124 | 0.992 |
| | | Proposed | 0.054 | 0.996 | 0.046 | 0.998 | 0.042 | 1.000 | 0.052 | 1.000 |
| | 20 | Yang | 0.072 | 1.000 | 0.068 | 1.000 | 0.064 | 1.000 | 0.114 | 0.998 |
| | | Proposed | 0.048 | 1.000 | 0.048 | 1.000 | 0.052 | 1.000 | 0.054 | 1.000 |
| 100 | 10 | Yang | 0.058 | 1.000 | 0.064 | 0.996 | 0.100 | 1.000 | 0.092 | 1.000 |
| | | Proposed | 0.048 | 1.000 | 0.048 | 1.000 | 0.044 | 1.000 | 0.042 | 1.000 |
| | 20 | Yang | 0.052 | 1.000 | 0.050 | 1.000 | 0.064 | 1.000 | 0.072 | 1.000 |
| | | Proposed | 0.052 | 1.000 | 0.046 | 1.000 | 0.048 | 1.000 | 0.046 | 1.000 |
Table 4. Empirical sizes (DGP 3) and powers (DGP 1) for the test of the presence of threshold effect.

| n | T | Methods | N(0, 1) Size | N(0, 1) Power | t(3) Size | t(3) Power | T(0.1, 10) Size | T(0.1, 10) Power | LN(0, 1) Size | LN(0, 1) Power |
|---|---|---|---|---|---|---|---|---|---|---|
| 50 | 10 | Yang | 0.012 | 1.000 | 0.012 | 0.996 | 0.006 | 1.000 | 0.004 | 0.998 |
| | | Zhou | 0.034 | 0.700 | 0.034 | 0.732 | 0.038 | 0.656 | 0.034 | 0.722 |
| | | Proposed | 0.040 | 0.912 | 0.050 | 0.992 | 0.056 | 1.000 | 0.044 | 0.996 |
| 50 | 20 | Yang | 0.008 | 1.000 | 0.010 | 1.000 | 0.004 | 1.000 | 0.006 | 1.000 |
| | | Zhou | 0.048 | 1.000 | 0.032 | 0.984 | 0.028 | 0.996 | 0.030 | 0.988 |
| | | Proposed | 0.048 | 1.000 | 0.042 | 1.000 | 0.038 | 1.000 | 0.056 | 1.000 |
| 100 | 10 | Yang | 0.006 | 1.000 | 0.010 | 1.000 | 0.004 | 1.000 | 0.010 | 1.000 |
| | | Zhou | 0.052 | 0.998 | 0.036 | 0.986 | 0.040 | 0.996 | 0.040 | 0.995 |
| | | Proposed | 0.054 | 0.998 | 0.038 | 1.000 | 0.046 | 1.000 | 0.050 | 1.000 |
| 100 | 20 | Yang | 0.006 | 1.000 | 0.008 | 1.000 | 0.006 | 1.000 | 0.004 | 1.000 |
| | | Zhou | 0.038 | 1.000 | 0.034 | 1.000 | 0.018 | 1.000 | 0.020 | 0.998 |
| | | Proposed | 0.042 | 1.000 | 0.042 | 1.000 | 0.044 | 1.000 | 0.056 | 1.000 |
Table 5. Empirical analysis results for the wage dataset.

| | Yang Est. | Yang s.e. | Yang Conf. int. | Proposed Est. | Proposed s.e. | Proposed Conf. int. |
|---|---|---|---|---|---|---|
| β₀ | 1.292 | 0.285 | [0.733, 1.850] | 1.142 | 0.051 | [1.043, 1.241] |
| β₁ | −0.399 | 0.085 | [−0.565, −0.232] | −0.276 | 0.018 | [−0.311, −0.242] |
| β₂ | −0.117 | 0.189 | [−0.486, 0.253] | −0.182 | 0.033 | [−0.247, −0.116] |
| γ₀ | 0.455 | 0.054 | [0.349, 0.560] | 0.636 | 0.021 | [0.596, 0.677] |
| γ₁ | −0.212 | 0.125 | [−0.456, 0.032] | −0.333 | 0.044 | [−0.419, −0.248] |

| Testing | Yang Statistic | Yang p-value | Proposed Statistic | Proposed p-value |
|---|---|---|---|---|
| H₀ᶜ: γ₁ = 0 | 2.891 | 0.089 | 58.520 | 0.000 |
| H₀ˡ: β₁ = 0 | 6.072 | 0.000 | 1.732 | 0.010 |
Table 6. Summary statistics for the estimated residuals obtained by the LS estimator.

| Kurtosis | Skewness | Jarque–Bera Test (p-Value) | Shapiro–Wilk Test (p-Value) |
|---|---|---|---|
| 17.396 | 0.480 | 0.000 | 0.000 |
Table 7. Sensitivity of estimates to influential observations. Standard errors are in parentheses.

| | Full Sample LS | Full Sample Robust | Trimmed Sample LS | Trimmed Sample Robust |
|---|---|---|---|---|
| β₀ | 1.292 | 1.142 | 1.673 | 1.149 |
| | (0.285) | (0.051) | (0.267) | (0.041) |
| β₁ | −0.399 | −0.276 | −0.483 | −0.273 |
| | (0.085) | (0.018) | (0.092) | (0.013) |
| β₂ | −0.117 | −0.182 | −0.214 | −0.191 |
| | (0.189) | (0.033) | (0.145) | (0.027) |
| γ₀ | 0.455 | 0.636 | 0.293 | 0.676 |
| | (0.054) | (0.021) | (0.040) | (0.019) |
| γ₁ | −0.212 | −0.333 | −0.151 | −0.343 |
| | (0.125) | (0.044) | (0.096) | (0.036) |
| No. of individuals | 716 | 716 | 688 | 688 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Ma, D.; Hong, H.; Li, Y.; Wan, C.; Wang, Y. A Robust Covariate-Dependent Kink Threshold Regression Model for Panel Data. Axioms 2026, 15, 319. https://doi.org/10.3390/axioms15050319
