MDPI - Publisher of Open Access Journals

33 pages, 525 KB

Open Access | Article

Limit Theorem for Kernel Estimate of the Conditional Hazard Function with Weakly Dependent Functional Data

by Abderrahmane Belguerna, Abdelkader Rassoul, Hamza Daoudi, Zouaoui Chikr Elmezouar and Fatimah Alshahrani

Symmetry 2025, 17(10), 1777; https://doi.org/10.3390/sym17101777 - 21 Oct 2025

Viewed by 346

Abstract

This paper examines the asymptotic behavior of a kernel estimator of the conditional hazard function, with particular emphasis on weakly dependent functional data. Specifically, we establish the asymptotic normality of the proposed estimator when the covariate follows a functional quasi-associated process. This contribution extends the scope of nonparametric inference under weak dependence within the framework of functional data analysis. The estimator is constructed through kernel smoothing techniques inspired by the classical Nadaraya–Watson approach, and its theoretical properties are rigorously derived under appropriate regularity conditions. To evaluate its practical performance, we carried out an extensive simulation study, where finite-sample outcomes were compared with their asymptotic counterparts. The results showed the robustness and reliability of the estimator across a range of scenarios, thereby confirming the validity of the proposed limit theorem in empirical settings.

(This article belongs to the Section Mathematics)
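
For orientation, the conditional hazard function mentioned in the abstract above is conventionally the ratio of the conditional density to the conditional survival function, and a kernel estimator plugs smoothed estimates into that ratio. The sketch below follows a common convention in the functional kernel literature. All symbols in it (the semi-metric $d$, the kernel $K$, the distribution-type kernel $H$, and the bandwidths $h_K$, $h_H$) are assumptions for illustration and may differ from the paper's own notation and estimator.

```latex
\begin{align}
  h(y \mid x) &= \frac{f(y \mid x)}{1 - F(y \mid x)}, \qquad F(y \mid x) < 1, \\
  \widehat{F}(y \mid x) &= \frac{\sum_{i=1}^{n} K\!\left(h_K^{-1} d(x, X_i)\right) H\!\left(h_H^{-1}(y - Y_i)\right)}
                               {\sum_{i=1}^{n} K\!\left(h_K^{-1} d(x, X_i)\right)}, \\
  \widehat{f}(y \mid x) &= \frac{h_H^{-1} \sum_{i=1}^{n} K\!\left(h_K^{-1} d(x, X_i)\right) H'\!\left(h_H^{-1}(y - Y_i)\right)}
                               {\sum_{i=1}^{n} K\!\left(h_K^{-1} d(x, X_i)\right)}, \\
  \widehat{h}(y \mid x) &= \frac{\widehat{f}(y \mid x)}{1 - \widehat{F}(y \mid x)}.
\end{align}
```

Here $\widehat{F}$ is a Nadaraya–Watson-type weighted average of smoothed indicators, and $\widehat{f}$ is its derivative in $y$.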

31 pages, 4670 KB

Open Access | Feature Paper | Article

Survival Analysis as Imprecise Classification with Trainable Kernels

by Andrei Konstantinov, Lev Utkin, Vlada Efremenko, Vladimir Muliukha, Alexey Lukashin and Natalya Verbova

Mathematics 2025, 13(18), 3040; https://doi.org/10.3390/math13183040 - 21 Sep 2025

Viewed by 841

Abstract

Survival analysis is a fundamental tool for modeling time-to-event data in healthcare, engineering, and finance, where censored observations pose significant challenges. While traditional methods like the Beran estimator offer nonparametric solutions, they often struggle with complex data structures and heavy censoring. This paper introduces three novel survival models, iSurvM (imprecise Survival model based on Mean likelihood functions), iSurvQ (imprecise Survival model based on Quantiles of likelihood functions), and iSurvJ (imprecise Survival model based on Joint learning), that combine imprecise probability theory with attention mechanisms to handle censored data without parametric assumptions. The first idea behind the models is to represent censored observations by interval-valued probability distributions for each instance over time intervals between event moments. The second idea is to employ the kernel-based Nadaraya–Watson regression with trainable attention weights for computing the imprecise probability distribution over time intervals for the entire dataset. The third idea is to consider three decision strategies for training, which correspond to the three proposed models. Experiments on synthetic and real datasets demonstrate that the proposed models, especially iSurvJ, consistently outperform the Beran estimator in terms of both accuracy and computational complexity. Code implementing the proposed models is publicly available.

(This article belongs to the Special Issue Advanced Neural Network and Machine Learning Algorithms, Models and Architectures in Data Mining)
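
The link between Nadaraya–Watson regression and attention that the abstract above relies on can be illustrated with a minimal sketch: with a Gaussian kernel, the normalized kernel weights are exactly a softmax over scaled negative squared distances. This is not the authors' iSurv code. The function names and the fixed `bandwidth` parameter are hypothetical, and a trainable variant would learn the bandwidth or replace the score function with a network.

```python
import numpy as np


def softmax(scores):
    """Numerically stable softmax over the last axis."""
    shifted = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(shifted)
    return weights / weights.sum(axis=-1, keepdims=True)


def nw_attention(queries, keys, values, bandwidth=1.0):
    """Nadaraya-Watson prediction written as attention (hypothetical helper).

    scores[i, j] = -||q_i - k_j||^2 / (2 h^2); softmax over j gives the weights.
    """
    diff = queries[:, None, :] - keys[None, :, :]
    scores = -np.sum(diff**2, axis=-1) / (2.0 * bandwidth**2)
    weights = softmax(scores)
    return weights @ values, weights


def nw_gaussian(queries, keys, values, bandwidth=1.0):
    """Textbook Nadaraya-Watson estimator with a Gaussian kernel, for comparison."""
    diff = queries[:, None, :] - keys[None, :, :]
    kernel = np.exp(-np.sum(diff**2, axis=-1) / (2.0 * bandwidth**2))
    return (kernel @ values) / kernel.sum(axis=1)
```

Calling `nw_attention(queries, keys, values)` with arrays of shape `(m, d)`, `(n, d)`, and `(n,)` returns the `m` predictions together with the `(m, n)` attention weights, whose rows sum to one.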

26 pages, 39229 KB

Open Access | Article

Local–Linear Two-Stage Estimation of Local Autoregressive Geographically and Temporally Weighted Regression Model

by Dan Xiang and Zhimin Hong

ISPRS Int. J. Geo-Inf. 2025, 14(7), 276; https://doi.org/10.3390/ijgi14070276 - 16 Jul 2025

Cited by 1 | Viewed by 763

Abstract

A geographically and temporally weighted regression (GTWR) model is an effective tool for dealing with spatial heterogeneity and temporal non-stationarity simultaneously. As an important characteristic of spatiotemporal data, spatiotemporal autocorrelation should be considered when constructing spatiotemporally varying coefficient models. The proposed local autoregressive geographically and temporally weighted regression (GTWRLAR) model can simultaneously handle spatiotemporal autocorrelations among response variables and the spatiotemporal heterogeneity of regression relationships. The two-stage weighted least squares (2SLS) estimation can effectively reduce computational complexity. However, the weighted least squares estimation is essentially a Nadaraya–Watson kernel-smoothing approach for nonparametric regression models, and it suffers from a boundary effect. For spatiotemporally varying coefficient models, the three-dimensional spatiotemporal coefficients (longitude, latitude, and time) inherently exhibit larger boundaries than one-dimensional intervals. Therefore, the boundary effect of the 2SLS estimation of GTWRLAR will be more serious. A local–linear geographically and temporally weighted 2SLS (GTWRLAR-L) estimation is proposed to correct the boundary effect in both the spatial and temporal dimensions of GTWRLAR and simultaneously improve parameter estimation accuracy. The simulation experiment shows that the GTWRLAR-L method reduces the root mean square error (RMSE) of parameter estimates compared to the standard GTWRLAR approach. Empirical analyses of carbon emissions in China’s Yellow River Basin (2017–2021) show that GTWRLAR-L enhances the adjusted $R^{2}$ from 0.888 to 0.893.
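
The boundary effect described above can be seen in a one-dimensional toy setting. The sketch below is an illustration under simplifying assumptions (scalar covariate, Gaussian kernel, noiseless linear target), not the spatiotemporal GTWRLAR-L estimator, and the function names are hypothetical. It compares a locally constant (Nadaraya–Watson) fit with a local-linear fit at the edge of the design interval.

```python
import numpy as np


def gaussian_kernel(u):
    """Unnormalized Gaussian kernel; the constant cancels in both estimators."""
    return np.exp(-0.5 * u**2)


def nadaraya_watson(x0, x, y, h):
    """Locally constant fit: kernel-weighted average of y around x0."""
    w = gaussian_kernel((x - x0) / h)
    return np.sum(w * y) / np.sum(w)


def local_linear(x0, x, y, h):
    """Local-linear fit: weighted least squares of y on (1, x - x0).

    The intercept of the local regression is the estimate at x0.
    """
    w = gaussian_kernel((x - x0) / h)
    design = np.column_stack([np.ones_like(x), x - x0])
    xtw = design.T * w                       # X^T W without forming W explicitly
    beta = np.linalg.solve(xtw @ design, xtw @ y)
    return beta[0]


# Noiseless linear target on [0, 1]: any error is pure smoothing bias.
x = np.linspace(0.0, 1.0, 101)
y = 2.0 * x + 1.0
nw_edge = nadaraya_watson(0.0, x, y, h=0.1)   # pulled upward: all neighbours lie to the right
ll_edge = local_linear(0.0, x, y, h=0.1)      # recovers the intercept 1.0 up to rounding
```

At the left edge every neighbour lies on one side of `x0`, so the locally constant average is biased toward the interior, while the local slope term absorbs that asymmetry. In the interior of a symmetric design the two fits agree for a linear target.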

80 pages, 858 KB

Open Access | Editor’s Choice | Article

Uniform in Number of Neighbor Consistency and Weak Convergence of k-Nearest Neighbor Single Index Conditional Processes and k-Nearest Neighbor Single Index Conditional U-Processes Involving Functional Mixing Data

by Salim Bouzebda

Symmetry 2024, 16(12), 1576; https://doi.org/10.3390/sym16121576 - 25 Nov 2024

Cited by 6 | Viewed by 2004

Abstract

U-statistics are fundamental in modeling statistical measures that involve responses from multiple subjects. They generalize the concept of the empirical mean of a random variable X to include summations over each m-tuple of distinct observations of X. W. Stute introduced conditional U-statistics, extending the Nadaraya–Watson estimates for regression functions. Stute demonstrated their strong pointwise consistency with the conditional expectation $r^{(m)}(\varphi, t)$, defined as $E[\varphi(Y_{1}, \dots, Y_{m}) \mid (X_{1}, \dots, X_{m}) = t]$ for $t \in X^{m}$. This paper focuses on estimating functional single index (FSI) conditional U-processes for regular time series data. We propose a novel, automatic, and location-adaptive procedure for estimating these processes based on k-Nearest Neighbor (kNN) principles. Our asymptotic analysis includes data-driven neighbor selection, making the method highly practical. The local nature of the kNN approach improves predictive power compared to traditional kernel estimates. Additionally, we establish new uniform results in bandwidth selection for kernel estimates in FSI conditional U-processes, including almost complete convergence rates and weak convergence under general conditions. These results apply to both bounded and unbounded function classes, satisfying certain moment conditions, and are proven under standard Vapnik–Chervonenkis structural conditions and mild model assumptions. Furthermore, we demonstrate uniform consistency for the nonparametric inverse probability of censoring weighted (I.P.C.W.) estimators of the regression function under random censorship. This result is independently valuable and has potential applications in areas such as set-indexed conditional U-statistics, the Kendall rank correlation coefficient, and discrimination problems.

(This article belongs to the Section Mathematics)
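
To make the conditional U-statistic in the abstract above concrete, here is a minimal sketch of a Stute-type kernel estimator of $r^{(m)}(\varphi, t)$ for order $m = 2$ with scalar covariates. It illustrates the classical fixed-bandwidth construction only: the kNN neighbor selection, the single index structure, and the functional mixing setting of the paper are not reproduced, and the function name, the Gaussian kernel, and the bandwidth `h` are assumptions.

```python
import numpy as np


def conditional_u_statistic(phi, x, y, t, h):
    """Kernel estimate of E[phi(Y_1, Y_2) | (X_1, X_2) = t] for order m = 2.

    phi must be vectorized: it receives broadcastable arrays of responses.
    Only pairs (i, j) of distinct observations contribute.
    """
    t1, t2 = t
    k1 = np.exp(-0.5 * ((t1 - x) / h) ** 2)   # K((t1 - X_i) / h)
    k2 = np.exp(-0.5 * ((t2 - x) / h) ** 2)   # K((t2 - X_j) / h)
    weights = np.outer(k1, k2)                # weights[i, j] = k1[i] * k2[j]
    np.fill_diagonal(weights, 0.0)            # drop pairs with i == j
    values = phi(y[:, None], y[None, :])
    return np.sum(weights * values) / np.sum(weights)
```

For example, the kernel `phi = lambda a, b: 0.5 * (a - b) ** 2` compares pairs of responses whose covariates fall near `t[0]` and `t[1]`. Taking `m = 1` would reduce the construction to the ordinary Nadaraya–Watson estimator.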

20 pages, 549 KB

Open Access | Article

Estimation in Semi-Varying Coefficient Heteroscedastic Instrumental Variable Models with Missing Responses

by Weiwei Zhang, Jingxuan Luo and Shengyun Ma

Mathematics 2023, 11(23), 4853; https://doi.org/10.3390/math11234853 - 2 Dec 2023

Cited by 1 | Viewed by 1719

Abstract

This paper studies the estimation problem for semi-varying coefficient heteroscedastic instrumental variable models with missing responses. First, we propose adjusted estimators for the unknown parameters and smooth functional coefficients, utilizing the ordinary profile least squares method and the instrumental variable adjustment technique with complete data. Second, we present an adjusted estimator of the stochastic error variance by employing the Nadaraya–Watson kernel estimation technique. Third, we apply the inverse probability-weighted method and the instrumental variable adjustment technique to construct adaptive-weighted adjusted estimators for the unknown parameters and smooth functional coefficients. The asymptotic properties of our proposed estimators are established under some regularity conditions. Finally, numerous simulation studies and a real data analysis are conducted to examine the finite sample performance of the proposed estimators.

(This article belongs to the Special Issue Computational Statistics and Data Analysis, 2nd Edition)

17 pages, 1377 KB

Open Access | Article

A Three-Stage Nonparametric Kernel-Based Time Series Model Based on Fuzzy Data

by Gholamreza Hesamian, Arne Johannssen and Nataliya Chukhrova

Mathematics 2023, 11(13), 2800; https://doi.org/10.3390/math11132800 - 21 Jun 2023

Cited by 5 | Viewed by 1856

Abstract

In this paper, a nonlinear time series model is developed for the case when the underlying time series data are reported by $LR$ fuzzy numbers. To this end, we present a three-stage nonparametric kernel-based estimation procedure for the center as well as the left and right spreads of the unknown nonlinear fuzzy smooth function. In each stage, the nonparametric Nadaraya–Watson estimator is used to evaluate the center and the spreads of the fuzzy smooth function. A hybrid algorithm is proposed to estimate the unknown optimal bandwidths and autoregressive order simultaneously. Various goodness-of-fit measures are utilized for performance assessment of the fuzzy nonlinear kernel-based time series model and for comparative analysis. The practical applicability and superiority of the novel approach in comparison with other fuzzy time series models are demonstrated via a simulation study and some real-life applications.

(This article belongs to the Special Issue Mathematical Data Science with Applications in Business, Industry, and Medicine)

25 pages, 2313 KB

Open Access | Article

Heterogeneous Treatment Effect with Trained Kernels of the Nadaraya–Watson Regression

by Andrei Konstantinov, Stanislav Kirpichenko and Lev Utkin

Algorithms 2023, 16(5), 226; https://doi.org/10.3390/a16050226 - 27 Apr 2023

Cited by 1 | Viewed by 3085

Abstract

A new method for estimating the conditional average treatment effect is proposed in this paper. It is called TNW-CATE (the Trainable Nadaraya–Watson regression for CATE) and is based on the assumption that the number of controls is rather large and the number of treatments is small. TNW-CATE uses the Nadaraya–Watson regression for predicting outcomes of patients from control and treatment groups. The main idea behind TNW-CATE is to train kernels of the Nadaraya–Watson regression by using a weight sharing neural network of a specific form. The network is trained on controls, and it replaces standard kernels with a set of neural subnetworks with shared parameters such that every subnetwork implements the trainable kernel, but the whole network implements the Nadaraya–Watson estimator. The network memorizes how the feature vectors are located in the feature space. The proposed approach is similar to transfer learning when domains of source and target data are similar, but the tasks are different. Various numerical simulation experiments illustrate TNW-CATE and compare it with the well-known T-learner, S-learner, and X-learner for several types of control and treatment outcome functions. The code of the proposed algorithms implementing TNW-CATE is publicly available.

(This article belongs to the Special Issue AI, Security for Digital Health)

10 pages, 744 KB

Open Access | Article

Fractal Perturbation of the Nadaraya–Watson Estimator

by Dah-Chin Luor and Chiao-Wen Liu

Fractal Fract. 2022, 6(11), 680; https://doi.org/10.3390/fractalfract6110680 - 17 Nov 2022

Cited by 6 | Viewed by 2348

Abstract

One of the main tasks in machine learning and curve fitting is to develop suitable models for given data sets. This requires generating a function that approximates data arising from some unknown function. The class of kernel regression estimators is one of the main types of nonparametric curve estimation. On the other hand, fractal theory provides new technologies for making complicated irregular curves in many practical problems. In this paper, we investigate fractal curve-fitting problems with the help of kernel regression estimators. For a given data set that arises from an unknown function m, one of the well-known kernel regression estimators, the Nadaraya–Watson estimator $\hat{m}$, is applied. We consider the case that m is Hölder-continuous of exponent $\beta$ with $0 < \beta \leq 1$, and the graph of m is irregular. An estimate for the expectation of $|\hat{m} - m|^{2}$ is established. Then a fractal perturbation $f_{[\hat{m}]}$ corresponding to $\hat{m}$ is constructed to fit the given data. The expectations of $|f_{[\hat{m}]} - \hat{m}|^{2}$ and $|f_{[\hat{m}]} - m|^{2}$ are also estimated.

(This article belongs to the Special Issue Recent Advances in Fractal Interpolation Functions and Their Applications in AI)

15 pages, 4736 KB

Open Access | Article

Design of Computational Models for Hydroturbine Units Based on a Nonparametric Regression Approach with Adaptation by Evolutionary Algorithms

by Vladimir Viktorovich Bukhtoyarov and Vadim Sergeevich Tynchenko

Computation 2021, 9(8), 83; https://doi.org/10.3390/computation9080083 - 28 Jul 2021

Viewed by 2516

Abstract

This article deals with the problem of designing regression models for evaluating the parameters of the operation of complex technological equipment—hydroturbine units. A promising approach to the construction of regression models based on nonparametric Nadaraya–Watson kernel estimates is considered. A known problem in applying this approach is to determine the effective values of kernel-smoothing coefficients. Kernel-smoothing factors significantly impact the accuracy of the regression model, especially under conditions of variability of noise and parameters of samples in the input space of models. This fully corresponds to the characteristics of the problem of estimating the parameters of hydraulic turbines. We propose to use the evolutionary genetic algorithm with an addition in the form of a local-search stage to adjust the smoothing coefficients. This ensures the local convergence of the tuning procedure, which is important given the high sensitivity of the quality criterion of the nonparametric model. On a set of test problems, the results were obtained showing a reduction in the modeling error by 20% and 28% for the methods of adjusting the coefficients by the standard and hybrid genetic algorithms, respectively, in comparison with the case of an arbitrary choice of the values of such coefficients. For the task of estimating the parameters of the operation of a hydroturbine unit, a number of promising approaches to constructing regression models based on artificial neural networks, multidimensional adaptive splines, and an evolutionary method of genetic programming were included in the research. The proposed nonparametric approach with a hybrid smoothing coefficient tuning scheme was found to be most effective with a reduction in modeling error of about 5% compared with the best of the alternative approaches considered in the study, which, according to the results of numerical experiments, was the method of multivariate adaptive regression splines. 

(This article belongs to the Section Computational Engineering)

17 pages, 967 KB

Open Access | Article

An Upper Bound of the Bias of Nadaraya-Watson Kernel Regression under Lipschitz Assumptions

by Samuele Tosatto, Riad Akrour and Jan Peters

Stats 2021, 4(1), 1-17; https://doi.org/10.3390/stats4010001 - 30 Dec 2020

Cited by 3 | Viewed by 6302

Abstract

The Nadaraya-Watson kernel estimator is among the most popular nonparametric regression techniques thanks to its simplicity. Its asymptotic bias was studied by Rosenblatt in 1969 and has been reported in the related literature. However, given its asymptotic nature, it gives no access to a hard bound. The increasing popularity of predictive tools for automated decision-making increases the need for hard (non-probabilistic) guarantees. To address this issue, we propose an upper bound of the bias which holds for finite bandwidths using Lipschitz assumptions and relaxing some of the prerequisites of Rosenblatt’s analysis. Our bound has potential applications in fields like surgical robots or self-driving cars, where some hard guarantees on the prediction error are needed.

(This article belongs to the Section Regression Models)

4 pages, 698 KB

Open Access | Extended Abstract

Bandwidth Selection in Nonparametric Regression with Large Sample Size

by Daniel Barreiro-Ures, Ricardo Cao and Mario Francisco-Fernández

Proceedings 2018, 2(18), 1166; https://doi.org/10.3390/proceedings2181166 - 17 Sep 2018

Viewed by 2378

Abstract

In the context of nonparametric regression estimation, the behaviour of kernel methods such as the Nadaraya-Watson or local linear estimators is heavily influenced by the value of the bandwidth parameter, which determines the trade-off between bias and variance. This clearly implies that the selection of an optimal bandwidth, in the sense of minimizing some risk function (MSE, MISE, etc.), is a crucial issue. However, the task of estimating an optimal bandwidth using the whole sample can be very expensive in terms of computing time in the context of Big Data, due to the computational complexity of some of the most used algorithms for bandwidth selection (leave-one-out cross validation, for example, has $O(n^{2})$ complexity). To overcome this problem, we propose two methods that estimate the optimal bandwidth for several subsamples of our large dataset and then extrapolate the result to the original sample size making use of the asymptotic expression of the MISE bandwidth. Preliminary simulation studies show that the proposed methods lead to a drastic reduction in computing time, while the statistical precision is only slightly decreased.

(This article belongs to the Proceedings of XoveTIC Congress 2018)
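
The quadratic cost of leave-one-out cross-validation mentioned above comes from the n × n kernel matrix that has to be formed for each candidate bandwidth. The sketch below shows this for the Nadaraya–Watson estimator with a Gaussian kernel, followed by a rescaling helper. It is not the authors' procedure: the function names are hypothetical, and the rescaling assumes the usual $n^{-1/5}$ rate of the MISE-optimal bandwidth for a second-order kernel.

```python
import numpy as np


def loo_cv_score(x, y, h):
    """Leave-one-out CV score of the Nadaraya-Watson estimator for bandwidth h."""
    u = (x[:, None] - x[None, :]) / h
    k = np.exp(-0.5 * u**2)            # n x n kernel matrix: the O(n^2) step
    np.fill_diagonal(k, 0.0)           # exclude each point from its own fit
    fitted = (k @ y) / k.sum(axis=1)
    return np.mean((y - fitted) ** 2)


def select_bandwidth(x, y, grid):
    """Return the grid value minimizing the leave-one-out score, plus all scores."""
    scores = np.array([loo_cv_score(x, y, h) for h in grid])
    return grid[int(np.argmin(scores))], scores


def rescale_bandwidth(h_sub, n_sub, n_full):
    """Extrapolate a subsample bandwidth to the full sample size.

    Assumes h_opt is proportional to n^(-1/5); this rate is an assumption here.
    """
    return h_sub * (n_sub / n_full) ** 0.2
```

A subsample-based workflow in this spirit would call `select_bandwidth` on a few subsamples of size `n_sub`, combine the results (for instance by averaging), and pass the outcome through `rescale_bandwidth`, so that the quadratic cost is paid on the subsamples rather than on the full data.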

17 pages, 306 KB

Open Access | Article

Nonparametric Regression with Common Shocks

by Eduardo A. Souza-Rodrigues

Econometrics 2016, 4(3), 36; https://doi.org/10.3390/econometrics4030036 - 1 Sep 2016

Cited by 1 | Viewed by 6819

Abstract

This paper considers a nonparametric regression model for cross-sectional data in the presence of common shocks. Common shocks are allowed to be very general in nature; they do not need to be finite dimensional with a known (small) number of factors. I investigate the properties of the Nadaraya-Watson kernel estimator and determine how general the common shocks can be while still obtaining meaningful kernel estimates. Restrictions on the common shocks are necessary because kernel estimators typically manipulate conditional densities, and conditional densities do not necessarily exist in the present case. By appealing to disintegration theory, I provide sufficient conditions for the existence of such conditional densities and show that the estimator converges in probability to the Kolmogorov conditional expectation given the sigma-field generated by the common shocks. I also establish the rate of convergence and the asymptotic distribution of the kernel estimator.

27 pages, 1135 KB

Open Access | Article

Bayesian Bandwidth Selection for a Nonparametric Regression Model with Mixed Types of Regressors

by Xibin Zhang, Maxwell L. King and Han Lin Shang

Econometrics 2016, 4(2), 24; https://doi.org/10.3390/econometrics4020024 - 22 Apr 2016

Cited by 11 | Viewed by 8250

Abstract

This paper develops a sampling algorithm for bandwidth estimation in a nonparametric regression model with continuous and discrete regressors under an unknown error density. The error density is approximated by the kernel density estimator of the unobserved errors, while the regression function is estimated using the Nadaraya-Watson estimator admitting continuous and discrete regressors. We derive an approximate likelihood and posterior for bandwidth parameters, followed by a sampling algorithm. Simulation results show that the proposed approach typically leads to better accuracy of the resulting estimates than cross-validation, particularly for smaller sample sizes. This bandwidth estimation approach is applied to a nonparametric regression model of the Australian All Ordinaries returns and to the kernel density estimation of gross domestic product (GDP) growth rates among Organisation for Economic Co-operation and Development (OECD) and non-OECD countries.

(This article belongs to the Special Issue Nonparametric Methods in Econometrics)

Search Results (13)
