Article

Modal Regression Estimation by Local Linear Approach in High-Dimensional Data Case

1 Department of Mathematical Sciences, College of Science, Princess Nourah bint Abdulrahman University, P.O. Box 84428, Riyadh 11671, Saudi Arabia
2 Department of Mathematics, College of Science, King Khalid University, Abha 62223, Saudi Arabia
* Author to whom correspondence should be addressed.
Axioms 2025, 14(7), 537; https://doi.org/10.3390/axioms14070537
Submission received: 24 May 2025 / Revised: 30 June 2025 / Accepted: 11 July 2025 / Published: 16 July 2025
(This article belongs to the Special Issue Advances in Statistical Simulation and Computing)

Abstract

This paper introduces a new nonparametric estimator for detecting the conditional mode in the functional input variable setting. The estimator integrates a local linear approach with an $L^1$-robust algorithm and treats the modal regression as the minimizer of the derivative of the conditional quantile function. As an asymptotic result, we derive the theoretical properties of the estimator by analyzing its convergence rate under the almost complete consistency framework. The result is stated under standard conditions characterizing both the functional structure of the data and the local linear approximation properties of the model. Moreover, the expression of the convergence rate retains the usual form of the stochastic convergence rate in functional statistics. Simulations and real-data applications demonstrate the algorithm's effectiveness, showing its advantage over existing methods in high-dimensional prediction tasks.

1. Introduction

In the last decade, interest in the statistical analysis of functional data has grown significantly, driven by its wide range of applications in various domains. The foundational work in this field was established by the pioneering monographs [1,2]; see [3] or [4] for recent results and references. Within this field, nonparametric statistics have proven to be highly effective in modeling functional data, particularly for prediction via the conditional mode and conditional quantiles. However, most existing nonparametric functional quantile or mode estimators heavily depend on traditional kernel methods. In this paper, we improve the accuracy and robustness of these estimators using the $L^1$-local linear estimation (LLE) method.
Modal and quantile regression are popular tools in nonparametric statistics for making predictions. They offer an alternative to traditional regression, which is based on the conditional expectation. Unlike the conventional approaches, these predictors often impart deeper insights, especially when the data have asymmetric or multimodal conditional densities, or when the noise has a heavy-tailed distribution (refer to [5] for the limitations of the conditional expectation). The latter work demonstrated that the conditional mode outperforms the conditional mean. Building on this, Ref. [6] created a prediction model that uses the derivative of the conditional density for multivariate data. Theoretical progress was made by [7], who established the asymptotic distribution of modal regression estimators for independent data, and Ref. [8] extended this result to the dependent case. Further advances in conditional mode prediction are comprehensively covered in [9,10]. While the cited works concern the kernel estimator of the conditional mode (CM) in finite-dimensional settings, the Nadaraya–Watson (NW) estimator was extended to functional data by [11], who estimated this regression model by maximizing the conditional density and proved its almost complete consistency. This result was generalized to the functional time series case by [12]. The asymptotic distribution of the NW estimator of the conditional mode was explored for the independent case in [13] and for the strongly mixing case by [14]. Significant theoretical advances were also made by [15], who determined $L^p$-convergence rates for NW-based functional CM estimators. For an alternative functional time series framework, Ref. [16] established stochastic consistency for ergodic functional time series with responses missing at random. Further theoretical and practical contributions in the field of functional CM estimation can be found in [17] and the references therein.
On the other hand, quantile regression has gained increasing interest in nonparametric data analysis; indeed, the quantile regression function is one of the most well-studied models in nonparametric vector statistics. Early foundational works in multivariate statistics go back to [18] and were popularized by [19,20]. In functional statistics as well, quantile regression has received considerable attention. Its high level of flexibility justifies this interest, as it can be treated as a linear, semiparametric, or nonparametric model. The functional linear modeling of the conditional quantile was considered by [21,22], which provide comprehensive overviews of the latest developments, while semiparametric estimation was developed by [23]. Concerning nonparametric estimation, quantile regression was estimated using the functional version of the NW estimator by [11], who proved the almost complete (Borel–Cantelli) consistency of the constructed estimator. Ref. [24] established the weak consistency of the functional NW quantile estimator. We refer to [25,26] for recent results.
The main novelty of this work is the estimation by the LLE method. This approach provides an important advantage over the NW estimator: it increases the efficiency of the estimation by reducing the bias. This behavior was observed by [27] in the multivariate case and extended to functional statistics by [28]. From a bibliographical perspective, the LLE technique has been thoroughly explored in functional statistics by many authors. For example, Ref. [29] demonstrated the $L^2$ consistency of the LLE estimator of the regression operator in Hilbert spaces, Ref. [30] proposed a more general estimator covering a wider range of functional covariates, and [31] developed an alternative estimator based on the inverse local covariance operator. Ref. [32] introduced functional LLE fitting for nonparametric conditional models, establishing pointwise and uniform almost complete (a.co.) consistency for the conditional density mode. The first results on the LLE estimation of the quantile function were stated by [33]; their estimator is obtained by inverting the cumulative distribution function (CDF) estimator introduced by [32]. An alternative was recently considered by [34], based on the M-estimation of the quantile regression.
While most current methods estimate the conditional mode by optimizing a kernel-based conditional density estimator, this paper introduces a new LLE approach based on quantile regression. Unlike traditional methods, our estimator combines the robustness of $L^1$-regression with the performance of LLE estimation, which enhances its resistance to outliers and improves its accuracy as a predictor or classifier in functional statistics. Theoretically, we demonstrate the almost complete convergence of the estimator under standard conditions, with a precise convergence rate. The benefits of the new estimator are also emphasized by comparing it to the standard one using simulated and real data examples. Our empirical analysis shows that the new estimator not only outperforms previous estimators in accuracy but also remains robust across a wide range of scenarios. Additionally, to demonstrate its practical value, we apply our method to predict the chemical composition of cocoa using spectrometry data. Specifically, we focus on the prediction of fat and moisture content using near-infrared spectroscopy data as functional regressors. It is well documented that near-infrared spectroscopy is a powerful, non-destructive tool for rapidly quantifying cocoa quality, allowing real-time quality control in production. Thus, combining this spectrometric processing with the proposed functional model can help chocolate producers maintain good flavor and texture in their products, ensuring quality at every step.
The paper is organized as follows. In the next section, we introduce the conditional mode based on the $L^1$ rule and explain its LLE estimator. Section 3 covers the principal assumptions and the main theoretical findings. In Section 4, we outline the principal features of our approach and provide a comparative analysis with existing methods. The finite-sample performance of the proposed estimator is examined in Section 5, using both simulation studies and real-world data applications. The detailed proofs of the intermediate results are presented in Appendix A.

2. The $L^1$ Conditional Mode and Its Local Linear Estimator

Let $\{(F_i, R_i)\}_{i=1}^{n}$ be an independent and identically distributed (i.i.d.) sample from the random pair $(F, R)$, where $F$ is a functional input variable taking values in a semi-metric space $(H, m)$ ($m$ being the semi-metric) and $R$ is the real output variable. Next, fix a point $F$ in $H$ and consider a neighborhood $N_F$ of $F$. We assume the existence of a regular conditional distribution function $D(\cdot \mid F)$ of $R$ given $F$, which is strictly increasing and admits a continuous density $d(\cdot \mid F)$ with respect to the Lebesgue measure on $\mathbb{R}$. The standard conditional mode is usually defined as the maximizer of this conditional density over a compact set $S$:
$$M_R(F) = \arg\max_{y \in S} d(y \mid F).$$
However, in order to introduce a more robust definition of the conditional mode, we reformulate this definition and relate the modal regression to the quantile derivative. Indeed, for $q \in D(S \mid F) =: [a_F, b_F] \subset (0, 1)$, let $Q_R^{q}(F)$ denote the conditional quantile of order $q$ given $F$. Following ideas similar to [35] in the linear case, we assume that $q \mapsto Q_R^{q}(F)$ is continuously differentiable ($C^1$) with $d(Q_R^{q}(F) \mid F) > 0$, yielding:
$$\partial_q Q_R^{q}(F) := \frac{\partial Q_R^{q}(F)}{\partial q} = \frac{1}{d\big(Q_R^{q}(F) \mid F\big)}.$$
It follows that the conditional mode can be expressed through the quantile derivative (maximizing $d(\cdot \mid F)$ amounts to minimizing its reciprocal):
$$M_R(F) = Q_R^{q_{M_R}}(F), \quad \text{where } q_{M_R} = \arg\min_{q \in [a_F, b_F]} \partial_q Q_R^{q}(F).$$
Recall that the theoretical conditional quantile $Q_R^{q}(F)$ is robustly defined as the M-regression associated with the following score function:
$$\min_{t \in \mathbb{R}} \mathbb{E}\big[ L_q(R - t) \mid F \big],$$
where $L_q(y) = y\,(q - \mathbb{1}_{\{y < 0\}})$ is the quantile loss function. Thus, the LLE estimator of the conditional mode is strongly linked to the LLE estimator of the quantile regression. For this last point, we adopt the fast LLE approach of [30], approximating $Q_R^{q}(\cdot)$ in the neighborhood $N_F$ by:
$$Q_R^{q}(Z) \approx A + B\, m(F, Z), \quad Z \in N_F,$$
and estimate the coefficients $(A, B)$ as the solution of
$$\min_{(A, B) \in \mathbb{R}^2} \sum_{i=1}^{n} L_q\big( R_i - A - B\, m(F_i, F) \big)\, F\!\left( \frac{m(F, F_i)}{f_n} \right),$$
where $F$ is a kernel function and $f_n$ is a bandwidth sequence. The conditional quantile estimator is then $\widehat{Q}_R^{q}(F) = \widehat{A}$. Consequently, the LLE estimator of the conditional mode is
$$\widehat{M}_R(F) = \widehat{Q}_R^{\widehat{q}_{M_R}}(F), \quad \text{where } \widehat{q}_{M_R} = \arg\min_{q \in [a_F, b_F]} \widehat{\partial_q Q}_R^{q}(F),$$
with $\widehat{\partial_q Q}_R^{q}$ being the estimator of the derivative of the conditional quantile, defined by
$$\widehat{\partial_q Q}_R^{q}(F) = \frac{\widehat{Q}_R^{q + b_n}(F) - \widehat{Q}_R^{q - b_n}(F)}{2 b_n},$$
where $b_n$ is a positive bandwidth-like sequence.
We point out that the functional NW estimator studied by [17] is a particular case of this study; it corresponds to $B = 0$. Next, let us clarify that the existence and uniqueness of $M_R(F)$ are guaranteed by the continuity and strict monotonicity of $D(\cdot \mid F)$, while $\widehat{M}_R(F)$ may not be unique. In what follows, $\widehat{M}_R(F)$ refers to any minimizer in (6). Furthermore, the primary contribution of this work is to establish the asymptotic properties of $\widehat{M}_R(F)$ when the covariate takes values in the semi-metric space $(H, m)$. To our knowledge, this is the first study to employ this estimation strategy for the conditional mode, even in multivariate settings. Note that the finite-dimensional case ($H = \mathbb{R}^p$) emerges as a special case, highlighting the generality of our approach.
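To fix ideas, the following Python sketch (our own illustration, not code from the paper) implements the two-stage procedure above under simplifying assumptions: the semi-metric values $m(F, F_i)$ are precomputed, the check-loss minimization defining $(\widehat{A}, \widehat{B})$ is solved numerically by a Nelder–Mead search, and the helper names (`quantile_loss`, `llq_estimate`, `llq_mode`) are hypothetical.

```python
import numpy as np
from scipy.optimize import minimize

def quantile_loss(u, q):
    """Check loss L_q(u) = u * (q - 1{u < 0})."""
    return u * (q - (u < 0).astype(float))

def llq_estimate(q, dists, R, f_n):
    """Local linear L1 conditional quantile at a fixed curve F.
    dists holds the precomputed semi-metric values m(F, F_i);
    the returned intercept A-hat is the estimated quantile."""
    K = np.maximum(1.0 - (dists / f_n) ** 2, 0.0)   # quadratic kernel on (-1, 1)
    def objective(theta):
        A, B = theta
        return np.sum(quantile_loss(R - A - B * dists, q) * K)
    # start from the empirical quantile of the responses inside the ball
    A0 = np.quantile(R[K > 0], q) if np.any(K > 0) else np.median(R)
    return minimize(objective, x0=[A0, 0.0], method="Nelder-Mead").x[0]

def llq_mode(dists, R, f_n, b_n, q_grid):
    """Modal regression: minimise the difference quotient estimating
    dQ/dq over a grid inside (b_n, 1 - b_n), then return the quantile
    of order q-hat."""
    deriv = [(llq_estimate(q + b_n, dists, R, f_n)
              - llq_estimate(q - b_n, dists, R, f_n)) / (2.0 * b_n)
             for q in q_grid]
    return llq_estimate(q_grid[int(np.argmin(deriv))], dists, R, f_n)
```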

3. Main Results

Throughout the paper, when no confusion is possible, we denote by $C$ or $C'$ some strictly positive generic constants, and we set $B(F, f) = \{ F' \in H : m(F, F') < f \}$. We now list the conditions required for deriving the almost complete convergence of $\widehat{M}_R(F)$ to $M_R(F)$.
(B1)
For any $f > 0$, $\alpha_F(f) := \alpha_F(-f, f) > 0$, and there exists a function $\beta_F(\cdot)$ such that
$$\text{for all } t \in (-1, 1), \quad \lim_{f \to 0} \frac{\alpha_F(t f, f)}{\alpha_F(f)} = \beta_F(t).$$
(B2)
The function $q \mapsto Q_R^{q}(F)$ is of class $C^3([a_F, b_F])$, and $D(\cdot \mid F)$ satisfies the following Lipschitz condition:
$$\text{for all } (F_1, F_2) \in N_F \times N_F, \quad \big| D(t \mid F_1) - D(t \mid F_2) \big| \le C\, m^{b}(F_1, F_2) \quad \text{for some } b > 0,$$
where $N_F$ denotes a neighborhood of $F$.
(B3)
The kernel $F$ is a positive and differentiable function, supported within $(-1, 1)$, and such that
$$\begin{pmatrix} F(1) - \int_{-1}^{1} F'(t)\, \beta_F(t)\, dt & F(1) - \int_{-1}^{1} (t F(t))'\, \beta_F(t)\, dt \\ F(1) - \int_{-1}^{1} (t F(t))'\, \beta_F(t)\, dt & F(1) - \int_{-1}^{1} (t^2 F(t))'\, \beta_F(t)\, dt \end{pmatrix}$$
is a positive definite matrix.
(B4)
The bandwidths $f_n$ and $b_n$ satisfy:
$$\lim_{n \to \infty} \frac{\log n}{n\, b_n\, \alpha_F(f_n)} = 0.$$
The assumed conditions are not very restrictive. They allow us to analyze various aspects of our subject, such as the model structure, the data framework, and the convergence rates. Considering the complexity of our proposed local linear algorithm and the strength of the Borel–Cantelli (BC) consistency, these assumptions offer a good compromise between generality and analytical feasibility. Specifically, each element of the subject is explored through a distinct assumption. Condition (B1) is indispensable in nonparametric functional data analysis (NFDA), and the function $\alpha_F(\cdot)$ can be given explicitly for several continuous processes (see [11]). The regularity condition (B2) is necessary to explore the functional space of the nonparametric path under study, and this assumption significantly impacts the bias term in the convergence rate of $\widehat{M}_R(F)$. Finally, conditions (B3)–(B4) relate to the kernel $F$ and the bandwidths $f_n$ and $b_n$, which cover the technical aspects of the estimator $\widehat{M}_R(F)$. These conditions are less restrictive than the usual technical requirements in conditional mode estimation: unlike the conditional density approach, which relies on two kernels, our method requires only one. It is important to recall that these conditions are needed to determine the convergence rate of the estimator in the Borel–Cantelli mode. Typically, a weaker form of consistency (distinct from BC consistency) can be established for the estimator under less restrictive assumptions; it is obtained by evaluating the expectation and the variance of the estimator. Although this in-probability consistency is weaker than Borel–Cantelli consistency, it remains sufficient to justify the estimator's practical applicability. Nevertheless, because this paper aims to demonstrate both the theoretical generality of the model and its practical viability, we have chosen to emphasize the stronger consistency result, which implies the weaker one.
The following theorem establishes the almost complete convergence (a.co.) of $\widehat{M}_R(F)$ (cf. [2] for details). This kind of convergence implies both almost sure convergence and convergence in probability.
Theorem 1.
Under assumptions (B1)–(B4), and if $\inf_{q \in (0,1)} \partial_q^3 Q_R^{q}(F) > 0$, we have:
$$\widehat{M}_R(F) - M_R(F) = O\big(f_n^{b/2}\big) + O\big(b_n^{1/2}\big) + O\left( \left( \frac{\log n}{n\, b_n^2\, \alpha_F(f_n)} \right)^{1/4} \right), \quad a.co.$$
Proof of Theorem 1.
The proof is based on some standard analytical arguments. Indeed, we write
$$\widehat{M}_R(F) - M_R(F) = \Big[ \widehat{Q}_R^{\widehat{q}_{M_R}}(F) - Q_R^{\widehat{q}_{M_R}}(F) \Big] + \Big[ Q_R^{\widehat{q}_{M_R}}(F) - Q_R^{q_{M_R}}(F) \Big].$$
Trivially,
$$\big| \widehat{M}_R(F) - M_R(F) \big| \le \sup_{q \in [a_F, b_F]} \big| \widehat{Q}_R^{q}(F) - Q_R^{q}(F) \big| + \big| Q_R^{\widehat{q}_{M_R}}(F) - Q_R^{q_{M_R}}(F) \big|.$$
By Taylor expansion,
$$Q_R^{\widehat{q}_{M_R}}(F) - Q_R^{q_{M_R}}(F) = (\widehat{q}_{M_R} - q_{M_R})\, \partial_q Q_R^{q^{*}_{M_R}}(F), \quad q^{*}_{M_R} \text{ being between } \widehat{q}_{M_R} \text{ and } q_{M_R}.$$
Since $q_{M_R}$ is the minimizer of $q \mapsto \partial_q Q_R^{q}(F)$,
$$\partial_q Q_R^{\widehat{q}_{M_R}}(F) - \partial_q Q_R^{q_{M_R}}(F) = \frac{(\widehat{q}_{M_R} - q_{M_R})^2}{2}\, \partial_q^3 Q_R^{q^{**}_{M_R}}(F), \quad \text{where } q^{**}_{M_R} \in (\widehat{q}_{M_R}, q_{M_R}).$$
In addition,
$$\begin{aligned} \partial_q Q_R^{\widehat{q}_{M_R}}(F) - \partial_q Q_R^{q_{M_R}}(F) &= \Big[ \partial_q Q_R^{\widehat{q}_{M_R}}(F) - \widehat{\partial_q Q}_R^{\widehat{q}_{M_R}}(F) \Big] + \Big[ \widehat{\partial_q Q}_R^{\widehat{q}_{M_R}}(F) - \partial_q Q_R^{q_{M_R}}(F) \Big] \\ &\le \big| \partial_q Q_R^{\widehat{q}_{M_R}}(F) - \widehat{\partial_q Q}_R^{\widehat{q}_{M_R}}(F) \big| + \big| \min_q \widehat{\partial_q Q}_R^{q}(F) - \min_q \partial_q Q_R^{q}(F) \big| \\ &\le 2 \sup_{q \in [a_F, b_F]} \big| \widehat{\partial_q Q}_R^{q}(F) - \partial_q Q_R^{q}(F) \big|. \end{aligned}$$
Thus,
$$\big| \widehat{M}_R(F) - M_R(F) \big| \le C \left( \sup_{q \in [a_F, b_F]} \big| \widehat{Q}_R^{q}(F) - Q_R^{q}(F) \big| + \sup_{q \in [a_F, b_F]} \big| \widehat{\partial_q Q}_R^{q}(F) - \partial_q Q_R^{q}(F) \big|^{1/2} \right).$$
Now, for a large enough n
$$\begin{aligned} \widehat{\partial_q Q}_R^{q}(F) - \partial_q Q_R^{q}(F) &= \frac{\big[ \widehat{Q}_R^{q + b_n}(F) - Q_R^{q + b_n}(F) \big] + \big[ Q_R^{q - b_n}(F) - \widehat{Q}_R^{q - b_n}(F) \big]}{2 b_n} \\ &\quad + \frac{Q_R^{q + b_n}(F) - Q_R^{q}(F) + Q_R^{q}(F) - Q_R^{q - b_n}(F)}{2 b_n} - \partial_q Q_R^{q}(F) \\ &\le C\, b_n^{-1} \sup_{q \in (a_F - b_n,\, b_F + b_n)} \big| \widehat{Q}_R^{q}(F) - Q_R^{q}(F) \big| + O(b_n). \end{aligned}$$
Thus, it suffices to evaluate, for a large enough n
$$\sup_{q \in (0, 1)} \big| \widehat{Q}_R^{q}(F) - Q_R^{q}(F) \big|.$$
Therefore, Theorem 1 is a consequence of the following proposition. □
Proposition 1.
Under assumptions of Theorem 1, we have
$$\sup_{q \in (0, 1)} \big| \widehat{Q}_R^{q}(F) - Q_R^{q}(F) \big| = O\big(f_n^{b}\big) + O_{a.co.}\left( \left( \frac{\log n}{n\, \alpha_F(f_n)} \right)^{1/2} \right).$$
The proof of this proposition is based on the Bahadur representation of the conditional quantile. The latter is a consequence of the following lemmas.
Lemma 1.
Let $(\Pi_n)$ be a sequence of decreasing real random functions and $(\Upsilon_n)$ be a sequence of real random variables such that
$$\Upsilon_n = o_{a.co.}(1) \quad \text{and} \quad \sup_{|\nu| \le M} \big| \Pi_n(\nu) + \lambda \nu - \Upsilon_n \big| = o_{a.co.}(1) \quad \text{for certain constants } \lambda, M > 0.$$
Then, for any real sequence $(\nu_n)$ such that $\Pi_n(\nu_n) = o_{a.co.}(1)$, we have:
$$\sum_{n=1}^{\infty} \mathbb{P}\big( |\nu_n| > M \big) < \infty.$$

4. On the Potential Impact of the Contribution

  • Comparison with existing approaches
    In an earlier contribution [36], we investigated robust mode estimation under a functional single-index structure, employing the local constant method. In this study, by contrast, we examine local linear estimation of the same model under a general functional framework. Firstly, observe that the present contribution can be viewed as a generalization of [36], in the sense that the local constant method is a particular case of the local linear approach, and the functional single-index structure is a special case of the general functional structure. Moreover, it is well documented (see, for instance, [27]) that the local linear method has many advantages over the Nadaraya–Watson (local constant) method. In particular, local linear estimation is typically used to reduce the bias and the boundary effect of the Nadaraya–Watson method, so using it instead of the standard kernel method substantially improves prediction accuracy. On the other hand, it is widely recognized that the single-index model reduces analytical complexity by projecting functional covariates onto a univariate index, transforming the functional data analysis problem into a one-dimensional one. This simplification is unrealistic in practice, as it ignores potentially influential higher-dimensional interactions; the generalized framework of this contribution circumvents this limitation. From a practical point of view, the kernel estimator assumes that the nonparametric model is flat within the neighborhood of the location point, which leads to suboptimal prediction. In contrast, the local linear method assumes that the model has a linear approximation in the neighborhood of the location point, which is more realistic and improves the prediction results.
  • On the bias reduction
    As mentioned in the previous comment, the behavior of the bias term is one of the main reasons for adopting the local linear method. Although the asymptotic behavior of the bias term is usually linked to the smoothness assumptions of the nonparametric model, this term can be significantly improved in the local linear approach. This beneficial characteristic is related to the weighting functions in the local linear estimator (see [27] in the non-functional case). A similar statement can be deduced in the functional case. Specifically, under standard conditions, the local linear approach offers a better bias term than the Nadaraya–Watson method [36]. Indeed, this improvement is also due to the specific weighting functions implemented in $\widehat{Q}_R^{q}(F)$: the leading term of the Bahadur representation of $\widehat{Q}_R^{q}(F)$ satisfies
    $$\mathbb{E}\Big[ D_i \big( A_3 F_i - A_2 f_n^{-1} D_i F_i \big) \Big] = 0,$$
    where
    $$F_i = F\big( f_n^{-1} m(F, F_i) \big), \quad D_i = m(F_i, F),$$
    and
    $$A_j = \alpha_F^{-1}(f_n)\, f_n^{1-j}\, \mathbb{E}\big[ D_i^{j-1} F_i \big], \quad j = 2, 3.$$
    Using the same analytical arguments as in [37], we can show that the first part of the bias term reduces to $o(f_n^{b/2})$. In conclusion, although the local linear and Nadaraya–Watson (NW) estimators share similar asymptotic properties, the local linearity of the model and the weighting functions of the LLE method improve the bias term in certain situations.
  • On the applicability of the estimator
    Of course, the ease of use of the estimator greatly depends on the ability to choose its parameters easily. Since the estimator is derived from quantile regression, multiple cross-validation methods are available for selecting the parameters, particularly the bandwidths associated with the functional component. First, for a given subset $H_n$ of positive real numbers, we consider the cross-validation criterion used by [38]:
    $$\arg\min_{f_n \in H_n} \frac{1}{n} \sum_{i=1}^{n} \Big( Q_p^{L}(F_i) - \widehat{Q}_p^{-i}(F_i) \Big)^2,$$
    where $Q_p^{L}(F_i)$ denotes a local empirical quantile and $\widehat{Q}_p^{-i}$ is the conditional quantile estimator at $F_i$ computed after excluding this observation from the sample. Secondly, we can apply the cross-validation approach utilized by [39]:
    $$\arg\min_{f_n \in H_n} \frac{1}{n} \sum_{i=1}^{n} L_p\Big( R_i - \widehat{Q}_p^{-i}(F_i) \Big).$$
    Additionally, various other methods can be utilized, such as the least squares cross-validation technique suggested by [6]:
    $$\arg\min_{f_n \in H_n} \frac{1}{n} \sum_{i=1}^{n} \Big( R_i - \widehat{M}_R(F_i) \Big)^2.$$
    Of course, the diversity of selection methods makes the estimator easy to implement in practice (see the sketch following this list).
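As an illustration of how these criteria operate in practice, here is a minimal leave-one-out cross-validation sketch in Python for the least squares criterion above; the function names and the distance-matrix interface are our own assumptions, not part of the paper.

```python
import numpy as np

def cv_bandwidth(D, R, grid, predict):
    """Leave-one-out CV of a bandwidth over a grid.
    D       : (n, n) matrix of semi-metric values m(F_i, F_j).
    predict : predict(dists_to_target, R_train, f_n) -> scalar prediction."""
    n = len(R)
    scores = []
    for f_n in grid:
        err = 0.0
        for i in range(n):
            mask = np.arange(n) != i          # exclude the i-th observation
            err += (R[i] - predict(D[i, mask], R[mask], f_n)) ** 2
        scores.append(err / n)
    return grid[int(np.argmin(scores))]
```

Replacing the squared error with the quantile loss $L_p$ recovers the criterion of [39], and plugging a modal predictor into `predict` recovers the least squares criterion of [6].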

5. Computational Part

5.1. Simulation Study

In this section, we demonstrate the practical effectiveness of our theoretical development through simulation experiments. Our principal goal is to show how easily the estimator can be used in practice and to emphasize its clear benefits over competitors such as the kernel conditional mode and the conditional-density-based LLE mode. In order to cover many practical situations, we generate data from different scenarios. Indeed, we consider the two following nonparametric functional models:
$$\text{heteroscedastic (Het.) model:} \quad R_i = 4 \int_0^1 \exp\!\left( \frac{1}{3 + F_i^3(t)} \right) dt + \cos\!\left( 3 + \int_0^1 F_i^3(t)\, dt \right) \epsilon_i,$$
$$\text{homoscedastic (Hom.) model:} \quad R_i = 5 \int_0^1 \log\!\left( \frac{2 + F_i^2(t)}{3 + F_i^3(t)} \right) dt + \epsilon_i,$$
where ϵ i and F i are independent. The functional input variable is also generated in two ways: smoothly and roughly. The smooth way is generated according to the following formula:
$$\text{Smooth curves:} \quad F_i(t) = a_i \cos\big( 4 (b_i t - \pi) \big) + b_i,$$
while the rough way is drawn from the following equation:
$$\text{Rough curves:} \quad F_i(t) = a_i \cos\big( 4 (b_i t - \pi) \big) + b_i + \eta_{i,t}.$$
For both cases, $b_i$ is drawn from a $N(0, 0.7)$ distribution and $a_i$ from a $N(3, 4)$ distribution, while the random variables $\eta_{i,t}$ are drawn from $N(0, 2)$. All the input curves $F_i$ are discretized on the same grid of 100 equispaced measurements in the interval $(0, 1)$. The resulting functional variables are shown in Figure 1 and Figure 2.
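For reproducibility, the curve generator can be sketched in Python as follows; we read $N(3, 4)$ as mean 3 and variance 4 (hence standard deviation 2), and similarly for the other Gaussian specifications, which is an assumption about the notation rather than a statement from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_curves(n=200, n_grid=100, rough=False):
    """Smooth or rough functional covariates discretized on 100
    equispaced points of (0, 1), following the designs above."""
    t = np.linspace(0.0, 1.0, n_grid)
    a = rng.normal(3.0, 2.0, size=(n, 1))            # a_i ~ N(3, 4)
    b = rng.normal(0.0, np.sqrt(0.7), size=(n, 1))   # b_i ~ N(0, 0.7)
    curves = a * np.cos(4.0 * (b * t - np.pi)) + b
    if rough:
        # pointwise perturbation eta_{i,t} ~ N(0, 2) for the rough design
        curves += rng.normal(0.0, np.sqrt(2.0), size=(n, n_grid))
    return t, curves
```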
Obviously, choosing various curve forms is essential for evaluating how the data’s functional structure affects the accuracy of estimates. Typically, the spline metric is used for modeling smooth curves, while the PCA metric is used for rough curves.
This strategy of data sampling is intentional: it allows us to test the robustness of the different models and to assess how the functionality of the data affects the estimation accuracy. We also evaluate the nonparametric model using three error distributions: Normal, Weibull, and Laplace. These were chosen because they are stable under translation and exhibit different levels of heavy-tailed behavior, which helps us measure how well the estimators handle outliers. In mathematical terms, the two competing modal regression estimators are defined by:
$$\text{The NW estimator:} \quad \widetilde{M}_R(F) = \arg\max_{y} \frac{\sum_{i=1}^{n} F\big( f_n^{-1} m(F, F_i) \big)\, F\big( b_n^{-1} (y - R_i) \big)}{b_n \sum_{i=1}^{n} F\big( f_n^{-1} m(F, F_i) \big)},$$
$$\text{and the LLE estimator:} \quad \overline{M}_R(F) = \arg\max_{y} \frac{\sum_{i,j=1}^{n} W_{ij}\, F\big( b_n^{-1} (y - R_j) \big)}{b_n \sum_{i,j=1}^{n} W_{ij}},$$
where $W_{ij} = m(F_i, F)\big( m(F_i, F) - m(F_j, F) \big)\, F\big( f_n^{-1} m(F, F_i) \big)\, F\big( f_n^{-1} m(F, F_j) \big)$.
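For completeness, a minimal Python sketch of the kernel (NW) competitor $\widetilde{M}_R$ is given below; maximizing over a finite grid of candidate modes is a simplifying assumption of ours, as is the `nw_mode` name.

```python
import numpy as np

def nw_mode(dists, R, f_n, b_n, y_grid):
    """Kernel (NW) modal regression: argmax over y_grid of the kernel
    estimate of the conditional density d(y | F)."""
    K = lambda t: np.maximum(1.0 - t ** 2, 0.0)      # quadratic kernel
    w = K(dists / f_n)                               # functional weights
    dens = [np.sum(w * K((y - R) / b_n)) / (b_n * np.sum(w))
            for y in y_grid]
    return y_grid[int(np.argmax(dens))]
```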
Of course, the behavior of the three estimators $\widehat{M}_R$, $\widetilde{M}_R$, and $\overline{M}_R$, as well as their practical implementation, is influenced by the choice of the metric $m$, the kernel $F$, and the smoothing parameters $(f_n, b_n)$. Evidently, the selection of the metric $m$ is intrinsically linked to the smoothness of the functional curves: smoother curves are better analyzed by the B-spline metric, which accommodates their regularity, while rougher curves require the PCA metric to handle their more irregular structure. We refer to [2] for more details on the definitions of these metrics. The choice of the kernel is motivated by technical considerations: in order to satisfy assumption (B3), we have used the quadratic kernel supported on $(-1, 1)$. For the smoothing parameters $(f_n, b_n)$, we have used the cross-validation rule based on the mean squared error defined by
$$MSE(\ddot{M}_R) = \frac{1}{n} \sum_{i=1}^{n} \big( R_i - \ddot{M}_R(F_i) \big)^2,$$
where $\ddot{M}_R$ represents either $\widehat{M}_R$, $\overline{M}_R$, or $\widetilde{M}_R$. It follows that
$$(f_{opt}, b_{opt}) = \arg\min_{f_n \in H_n,\, b_n \in B_n} MSE(\ddot{M}_R),$$
where $H_n$ (respectively, $B_n$) is the set of positive real numbers $f_n$ (respectively, $b_n$) such that the ball centered at $F$ with radius $f_n$ (respectively, the interval $[y - b_n, y + b_n]$) contains exactly $k$ neighbors of $F$ (respectively, of $y$). Typically, the number $k$ is selected from the subset $\{5, 10, 20, \ldots, 0.5 n\}$; a sketch of this k-nearest-neighbors bandwidth grid follows below. For our goal, we generate $n = 200$ observations of $(F_i, R_i)$, and we examine the robustness and accuracy of the different estimators using the MSE defined in (14). The obtained results are reported in Table 1.
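The k-nearest-neighbors bandwidth grid described above can be built as in the following sketch; reading the subset $\{5, 10, 20, \ldots, 0.5 n\}$ as roughly doubling values, and taking the midpoint between consecutive ordered distances so that the ball contains exactly $k$ curves, are both our assumptions.

```python
import numpy as np

def knn_radius_grid(dists):
    """Radii f_n such that the ball B(F, f_n) contains exactly k
    sample curves, for k in {5, 10, 20, ..., 0.5 n}."""
    d = np.sort(dists)                    # ordered distances m(F, F_i)
    n = len(d)
    ks = [k for k in (5, 10, 20, 40, 80, 160) if k <= 0.5 * n]
    return [0.5 * (d[k - 1] + d[k]) for k in ks]
```

The same construction applied to the values $|y - R_i|$ yields the grid $B_n$ for the second bandwidth $b_n$.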
The performance of these estimators is strongly influenced by the various components of the study, particularly the nature of the functional data and the nonparametric approach. Nevertheless, combining $L^1$-techniques with the LLE smoothing method proves highly advantageous in terms of robustness and accuracy. As evidenced by the MSE results in Table 1, the variability of $\widehat{M}_R$ is lower than that of the other estimators, demonstrating that the $L^1$-LLE estimator of the conditional mode offers greater robustness and improved accuracy.

5.2. Real Data Application

Spectrometry data analysis is a powerful tool for generating high-frequency functional data. In this study, we focus on food quality as a critical real-world issue. Clearly, ensuring high-quality food is a top priority for both consumers and producers. Traditionally, quality control relies on chemical testing, an effective but destructive, time-consuming, and expensive approach. Instead, we provide a smart solution that combines spectrometry techniques with advanced statistical tools, permitting an efficient, cost-effective, and non-invasive approach to ensuring food quality. Specifically, we demonstrate how our new predictor can accurately predict the fat and moisture contents of chocolate. To this end, we evaluate our approach against leading alternatives, including the NW estimator, the LLE conditional expectation, and the density-based LLE mode. This computational study allows us to demonstrate the superiority and practical advantages of $\widehat{M}_R$ in robustness as well as in accuracy. We compare these approaches by testing their predictive performance using the near-infrared spectroscopy curve as functional input $F$, with the fat (or moisture) quantity as response variable $R$. The near-infrared data set is available online (https://data.mendeley.com/datasets/7734j4fd98/1, accessed on 8 May 2025). This issue has been considered by many authors using different algorithms (see [40] for a list of references). Formally, in this real data example, the spectrum curve gives the absorbance of light across wavelengths ranging from 1000 to 2500 nm. These curves are visualized in Figure 3.
These curves represent the near-infrared (NIR) absorbance spectra, in the wavelength range from 1000 to 2500 nm, of 72 bulk intact cocoa bean samples; each bulk amounted to 50 g of intact beans. Recall that the choice of the response variable $R$ is motivated by the fact that the fat and moisture quantities are determinants of cocoa quality. The quality of cocoa beans is the cornerstone of exceptional chocolate production, profoundly influencing the sensory, nutritional, and commercial value of the final product. Thus, precise determination of cocoa contents is essential for compliance with chocolate quality standards, economic valuation, and research in agriculture and nutrition. Consequently, optimizing the fat and moisture percentages is primordial to delivering high-quality chocolate products that meet both consumer and industry demands. Now, to put our predictors into action, we need to carefully select the principal parameters that drive the estimators' performance. This step is crucial to ensuring the practical feasibility of the four predictors $\widehat{M}_R$, $\widetilde{M}_R$, $\overline{M}_R$, and
$$\overline{CE}(F) = \frac{\sum_{i,j=1}^{n} W_{ij} R_j}{\sum_{i,j=1}^{n} W_{ij}}.$$
Based on our earlier discussion, the kernel selection is a direct consequence of assumption (B3), ensuring theoretical consistency. Meanwhile, the metric $m$ plays a crucial role in quantifying the smoothness of the functional curves $F_i$, allowing us to better analyze their behavior. Clearly, the shape of the curves in Figure 3 shows that the $L^2$ metric between the first derivatives of the curves (the curves being regular) is more suitable for these data. Once again, the smoothing parameters $(f_n, b_n)$ are selected according to the same strategy as in (14). This approach has proven highly effective in our predictive modeling, highlighting the importance of leave-one-out cross-validation data-driven algorithms in functional data analysis. Figure 4 and Figure 5 summarize our statistical predictions of fat and moisture for the 70 observations, comparing each approach's performance by plotting predicted values against true values.
We use the mean squared error (MSE) to assess model performance for the prediction results. It is defined by
$$MSE(\ddot{\theta}) = \frac{1}{70} \sum_{i=1}^{70} \big( R_i - \ddot{\theta}(F_i) \big)^2,$$
where $\ddot{\theta}$ can represent $\widehat{M}_R$, $\overline{M}_R$, $\widetilde{M}_R$, or $\overline{CE}$. Clearly, the predictor $\widehat{M}_R$ outperforms the other predictors in precision: it predicts (fat, moisture) with MSEs equal to $(0.97, 1.06)$, respectively, the best result compared with those of $\overline{M}_R$, $\widetilde{M}_R$, and $\overline{CE}$. Specifically, we record for (fat, moisture) $(1.70, 2.89)$ for $\overline{M}_R$, $(4.89, 7.48)$ for $\widetilde{M}_R$, and $(3.45, 6.22)$ for $\overline{CE}$.

Author Contributions

The authors contributed approximately equally to this work. Formal analysis, F.A.A.; Validation, M.B.A.; Writing—review & editing, A.L. and Z.K. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by Princess Nourah bint Abdulrahman University Researchers Supporting Project number (PNURSP2025R515), Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia, and by the Deanship of Scientific Research and Graduate Studies at King Khalid University through the Small Research Project under grant number RGP1/41/46.

Data Availability Statement

The data used in this study are available through the link https://data.mendeley.com/datasets/7734j4fd98/1 (accessed on 8 May 2025).

Acknowledgments

The authors would like to thank the Editor, the Associate Editor, and two anonymous reviewers for their valuable comments and suggestions, which substantially improved the quality of an earlier version of this paper. The authors also extend their appreciation to the funders of this work: Princess Nourah bint Abdulrahman University Researchers Supporting Project number (PNURSP2025R515), Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia, and the Deanship of Scientific Research and Graduate Studies at King Khalid University, which funded this work through the Small Research Project under grant number RGP1/41/46.

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A. Proofs of Intermediate Results

Proof of Proposition 1.
We apply Lemma 1 with
$$\nu_n = \begin{pmatrix} \widehat{A} - A \\ f_n (\widehat{B} - B) \end{pmatrix}, \qquad X_n(\nu) = \frac{1}{n\, \alpha_F(f_n)} \sum_{i=1}^{n} \zeta_q(\nu) \begin{pmatrix} 1 \\ f_n^{-1} D_i \end{pmatrix} F_i \quad \text{for } \nu = \begin{pmatrix} c \\ d \end{pmatrix},$$
and
$$\Upsilon_n = X_n(\nu_0) \quad \text{with} \quad \nu_0 = \begin{pmatrix} 0 \\ 0 \end{pmatrix},$$
where
$$\zeta_q(\nu) = q - \mathbb{1}_{\big\{ R_i \le (c + A) + (f_n^{-1} d + B) D_i \big\}}, \qquad F_i = F\big( f_n^{-1} m(F, F_i) \big), \qquad D_i = m(F_i, F).$$
So, the main result is a consequence of the following lemmas. □
Lemma A1.
Under assumptions (B1)–(B4), we have
$$\Upsilon_n = O\big(f_n^{b}\big) + O_{a.co.}\left( \left( \frac{\log n}{n\, \alpha_F(f_n)} \right)^{1/2} \right).$$
Proof of Lemma A1.
Firstly, we set
$$Z_i^1 = \big( q - \mathbb{1}_{\{R_i \le A + B D_i\}} \big) F_i - \mathbb{E}\Big[ \big( q - \mathbb{1}_{\{R_i \le A + B D_i\}} \big) F_i \Big]$$
and
$$Z_i^2 = \big( q - \mathbb{1}_{\{R_i \le A + B D_i\}} \big) D_i F_i - \mathbb{E}\Big[ \big( q - \mathbb{1}_{\{R_i \le A + B D_i\}} \big) D_i F_i \Big].$$
Then
$$\Upsilon_n - \mathbb{E}[\Upsilon_n] = \begin{pmatrix} \Upsilon_n^1 \\ \Upsilon_n^2 \end{pmatrix}, \quad \text{where} \quad \Upsilon_n^1 = \frac{1}{n\, \alpha_F(f_n)} \sum_{i=1}^{n} Z_i^1, \qquad \Upsilon_n^2 = \frac{1}{n\, f_n\, \alpha_F(f_n)} \sum_{i=1}^{n} Z_i^2.$$
Obviously,
$$|Z_i^1| \le C \quad \text{and} \quad |Z_i^2| \le C f_n.$$
Moreover,
$$\mathbb{E}\big[ (Z_i^1)^2 \big] \le C\, \alpha_F(f_n) \quad \text{and} \quad \mathbb{E}\big[ (Z_i^2)^2 \big] \le C\, f_n^2\, \alpha_F(f_n).$$
It follows that
$$\Upsilon_n^1 - \mathbb{E}\big[ \Upsilon_n^1 \big] = O_{a.co.}\left( \left( \frac{\log n}{n\, \alpha_F(f_n)} \right)^{1/2} \right) \quad \text{and} \quad \Upsilon_n^2 - \mathbb{E}\big[ \Upsilon_n^2 \big] = O_{a.co.}\left( \left( \frac{\log n}{n\, \alpha_F(f_n)} \right)^{1/2} \right).$$
Next,
$$\mathbb{E}\big[ \Upsilon_n^1 \big] = \frac{1}{\alpha_F(f_n)} \mathbb{E}\Big[ \big( q - \mathbb{1}_{\{R_1 \le A + B D_1\}} \big) F_1 \Big] \le \frac{1}{\alpha_F(f_n)} \mathbb{E}\Big[ \big| D\big( Q_R^{q}(F) \mid F \big) - D\big( A + B D_1 \mid F_1 \big) \big|\, F_1 \Big] = O\big( f_n^{b} \big).$$
Similarly,
$$\mathbb{E}\big[ \Upsilon_n^2 \big] = \frac{1}{f_n\, \alpha_F(f_n)} \mathbb{E}\Big[ \big( q - \mathbb{1}_{\{R_1 \le A + B D_1\}} \big) D_1 F_1 \Big] \le \frac{1}{f_n\, \alpha_F(f_n)} \mathbb{E}\Big[ \big| D\big( Q_R^{q}(F) \mid F \big) - D\big( A + B D_1 \mid F_1 \big) \big|\, D_1 F_1 \Big] = O\big( f_n^{b} \big).$$
Therefore,
$$\Upsilon_n = O\big( f_n^{b} \big) + O_{a.co.}\left( \left( \frac{\log n}{n\, \alpha_F(f_n)} \right)^{1/2} \right). \qquad \square$$
Lemma A2.
Under assumptions (B1)–(B4), we have
$$\sup_{\|\nu\| \le M} \big\| X_n(\nu) + \lambda_0\, \mathbf{D}\, \nu - \Upsilon_n \big\| = o_{a.co.}(1),$$
with
$$\mathbf{D} = \begin{pmatrix} F(1) - \int_{-1}^{1} F'(t)\, \beta_F(t)\, dt & F(1) - \int_{-1}^{1} (t F(t))'\, \beta_F(t)\, dt \\ F(1) - \int_{-1}^{1} (t F(t))'\, \beta_F(t)\, dt & F(1) - \int_{-1}^{1} (t^2 F(t))'\, \beta_F(t)\, dt \end{pmatrix}$$
and
$$\lambda_0 = d\big( Q_R^{q}(F) \mid F \big).$$
Proof of Lemma A2.
We first prove that
$$\sup_{\|\nu\| \le M} \big\| X_n(\nu) - \Upsilon_n - \mathbb{E}\big[ X_n(\nu) - \Upsilon_n \big] \big\| = O_{a.co.}\left( \left( \frac{\log n}{n\, \alpha_F(f_n)} \right)^{1/2} \right) \tag{A1}$$
and
$$\sup_{\|\nu\| \le M} \big\| \mathbb{E}\big[ X_n(\nu) - \Upsilon_n \big] + d\big( Q_R^{q}(F) \mid F \big)\, \mathbf{D}\, \nu \big\| = O\big( f_n^{b} \big). \tag{A2}$$
Next, we use the compactness of the ball $B(0, M)$ in $\mathbb{R}^2$ to write
$$B(0, M) \subset \bigcup_{j=1}^{d_n} B(\nu_j, l_n), \qquad \nu_j = \begin{pmatrix} c_j \\ d_j \end{pmatrix} \quad \text{and} \quad l_n = d_n^{-1} = 1/n.$$
Taking $j(\nu) = \arg\min_{j} \|\nu - \nu_j\|$, we use the fact that
$$\begin{aligned} \sup_{\|\nu\| \le M} \big\| X_n(\nu) - \Upsilon_n - \mathbb{E}[X_n(\nu) - \Upsilon_n] \big\| &\le \sup_{\|\nu\| \le M} \big\| X_n(\nu) - X_n(\nu_{j(\nu)}) \big\| \\ &\quad + \sup_{\|\nu\| \le M} \big\| X_n(\nu_{j(\nu)}) - \Upsilon_n - \mathbb{E}[X_n(\nu_{j(\nu)}) - \Upsilon_n] \big\| \\ &\quad + \sup_{\|\nu\| \le M} \big\| \mathbb{E}[X_n(\nu) - X_n(\nu_{j(\nu)})] \big\|. \end{aligned}$$
Since $| \mathbb{1}_{\{Y < a\}} - \mathbb{1}_{\{Y < b\}} | \le \mathbb{1}_{\{|Y - b| \le |a - b|\}}$, then
$$\sup_{\|\nu\| \le M} \big\| X_n(\nu) - X_n(\nu_{j(\nu)}) \big\| \le \frac{1}{n\, \alpha_F(f_n)} \sum_{i} \Omega_i,$$
where
$$\Omega_i = \sup_{\|\nu\| \le M} \mathbb{1}_{\big\{ \big| R_i - (c_{j(\nu)} + A) - (f_n^{-1} d_{j(\nu)} + B) D_i \big| \le C l_n \big\}} \begin{pmatrix} 1 \\ f_n^{-1} D_i \end{pmatrix} F_i.$$
Clearly,
$$\|\Omega_i\| \le C, \qquad \mathbb{E}[\|\Omega_i\|] = O\big( l_n\, \alpha_F(f_n) \big) \quad \text{and} \quad \mathbb{E}[\|\Omega_i\|^2] = O\big( l_n\, \alpha_F(f_n) \big).$$
Using the fact that
$$l_n = o\left( \left( \frac{\log n}{n\, \alpha_F(f_n)} \right)^{1/2} \right), \tag{A3}$$
we get
$$\sup_{\|\nu\| \le M} \big\| X_n(\nu) - X_n(\nu_{j(\nu)}) \big\| = O_{a.co.}\left( \left( \frac{\log n}{n\, \alpha_F(f_n)} \right)^{1/2} \right).$$
For the last term,
$$\sup_{\|\nu\| \le M} \big\| \mathbb{E}[X_n(\nu) - X_n(\nu_{j(\nu)})] \big\| \le \frac{1}{\alpha_F(f_n)} \mathbb{E}[\|\Omega_1\|] \le C l_n;$$
then, by (A3), we obtain
$$\sup_{\|\nu\| \le M} \big\| \mathbb{E}[X_n(\nu) - X_n(\nu_{j(\nu)})] \big\| = o\left( \left( \frac{\log n}{n\, \alpha_F(f_n)} \right)^{1/2} \right).$$
It remains to examine
$$\sup_{\|\nu\| \le M} \big\| X_n(\nu_{j(\nu)}) - \Upsilon_n - \mathbb{E}[X_n(\nu_{j(\nu)}) - \Upsilon_n] \big\|.$$
For this, we write
$$X_n(\nu_j) - \Upsilon_n - \mathbb{E}[X_n(\nu_j) - \Upsilon_n] = \begin{pmatrix} \Theta_n^1(\nu_j) \\ \Theta_n^2(\nu_j) \end{pmatrix},$$
where
$$\Theta_n^1(\nu_j) = \frac{1}{n\, \alpha_F(f_n)} \sum_{i=1}^{n} \Lambda_i^1, \qquad \Theta_n^2(\nu_j) = \frac{1}{n\, f_n\, \alpha_F(f_n)} \sum_{i=1}^{n} \Lambda_i^2,$$
with
$$\Lambda_i^1 = \big( \zeta_q(\nu_j) - \zeta_q(\nu_0) \big) F_i - \mathbb{E}\big[ \big( \zeta_q(\nu_j) - \zeta_q(\nu_0) \big) F_i \big]$$
and
$$\Lambda_i^2 = \big( \zeta_q(\nu_j) - \zeta_q(\nu_0) \big) D_i F_i - \mathbb{E}\big[ \big( \zeta_q(\nu_j) - \zeta_q(\nu_0) \big) D_i F_i \big].$$
Clearly,
$$|\Lambda_i^1| \le C \quad \text{and} \quad |\Lambda_i^2| \le C f_n.$$
Moreover,
$$\mathbb{E}\big[ (\Lambda_i^1)^2 \big] \le C\, \alpha_F(f_n) \quad \text{and} \quad \mathbb{E}\big[ (\Lambda_i^2)^2 \big] \le C\, f_n^2\, \alpha_F(f_n).$$
Therefore, there exists $\eta > 0$ such that
$$\sum_{n} \mathbb{P}\left( \sup_{\|\nu\| \le M} \big\| X_n(\nu_{j(\nu)}) - \Upsilon_n - \mathbb{E}[X_n(\nu_{j(\nu)}) - \Upsilon_n] \big\| \ge \eta \left( \frac{\log n}{n\, \alpha_F(f_n)} \right)^{1/2} \right) \le \sum_{n} d_n \max_{j} \mathbb{P}\left( \big\| X_n(\nu_j) - \Upsilon_n - \mathbb{E}[X_n(\nu_j) - \Upsilon_n] \big\| \ge \eta \left( \frac{\log n}{n\, \alpha_F(f_n)} \right)^{1/2} \right) < \infty,$$
which completes the proof of (A1).
Concerning (A2), we write
$$X_n(\nu) - \Upsilon_n = \begin{pmatrix} \Theta_n^1(\nu) \\ \Theta_n^2(\nu) \end{pmatrix}, \quad \text{where} \quad \Theta_n^1(\nu) = \frac{1}{n\, \alpha_F(f_n)} \sum_{i=1}^{n} \big( \zeta_q(\nu) - \zeta_q(\nu_0) \big) F_i \quad \text{and} \quad \Theta_n^2(\nu) = \frac{1}{n\, f_n\, \alpha_F(f_n)} \sum_{i=1}^{n} \big( \zeta_q(\nu) - \zeta_q(\nu_0) \big) D_i F_i.$$
On the other hand, we have
$$\begin{aligned} \mathbb{E}\big[ \Theta_n^1(\nu) \big] &= -\frac{1}{\alpha_F(f_n)} \mathbb{E}\Big[ \Big( \mathbb{1}_{\{R_1 \le (c + A) + (f_n^{-1} d + B) D_1\}} - \mathbb{1}_{\{R_1 \le A + B D_1\}} \Big) F_1 \Big] \\ &= -\frac{1}{\alpha_F(f_n)} \mathbb{E}\Big[ \Big( D\big( (c + A) + (f_n^{-1} d + B) D_1 \mid F_1 \big) - D\big( A + B D_1 \mid F_1 \big) \Big) F_1 \Big] \\ &= -\frac{1}{\alpha_F(f_n)} \mathbb{E}\Big[ \Big( D\big( (c + A) + (f_n^{-1} d + B) D_1 \mid F \big) - D\big( A + B D_1 \mid F \big) \Big) F_1 \Big] + O\big( f_n^{b} \big) \\ &= -\frac{1}{\alpha_F(f_n)} \mathbb{E}\Big[ d\big( A + B D_1 \mid F \big) \big( 1,\ f_n^{-1} D_1 \big) \nu\, F_1 \Big] + O\big( f_n^{b} \big) + o(\|\nu\|) \\ &= -d\big( Q_R^{q}(F) \mid F \big) \frac{1}{\alpha_F(f_n)} \Big( \mathbb{E}[F_1],\ f_n^{-1} \mathbb{E}[D_1 F_1] \Big) \nu + O\big( f_n^{b} \big) + o(\|\nu\|). \end{aligned}$$
Similarly,
$$\mathbb{E}\big[ \Theta_n^2(\nu) \big] = -d\big( Q_R^{q}(F) \mid F \big) \frac{1}{f_n\, \alpha_F(f_n)} \Big( \mathbb{E}[D_1 F_1],\ f_n^{-1} \mathbb{E}[D_1^2 F_1] \Big) \nu + O\big( f_n^{b} \big) + o(\|\nu\|).$$
Thus,
$$\mathbb{E}\big[ X_n(\nu) - \Upsilon_n \big] = -d\big( Q_R^{q}(F) \mid F \big) \frac{1}{\alpha_F(f_n)} \begin{pmatrix} \mathbb{E}[F_i] & f_n^{-1} \mathbb{E}[D_i F_i] \\ f_n^{-1} \mathbb{E}[D_i F_i] & f_n^{-2} \mathbb{E}[D_i^2 F_i] \end{pmatrix} \nu + O\big( f_n^{b} \big) + o(\|\nu\|).$$
Obviously,
$$f_n^{-a}\, \mathbb{E}\big[ D_i^{a} F_i^{c} \big] = \alpha_F(f_n) \left( F^{c}(1) - \int_{-1}^{1} \big( u^{a} F^{c}(u) \big)'\, \beta_F(u)\, du \right) + o\big( \alpha_F(f_n) \big).$$
Hence,
$$\sup_{\|\nu\| \le M} \big\| \mathbb{E}\big[ X_n(\nu) - \Upsilon_n \big] + d\big( Q_R^{q}(F) \mid F \big)\, \mathbf{D}\, \nu \big\| = O\big( f_n^{b} \big) + o(\|\nu\|),$$
implying the result (A2). □
Lemma A3.
Under assumptions (B1)–(B4), we have
$$\sup_{\|\nu\| \le M}\ \sup_{q \in [0, 1]} \big\| \Pi_n(\nu, q) - \mathbb{E}\big[ \Pi_n(\nu, q) \big] \big\| = O_{a.co.}\left( \left( \frac{\log n}{n\, \alpha_F(f_n)} \right)^{1/2} \right),$$
where
$$\Pi_n(\nu, q) = \frac{1}{n\, \alpha_F(f_n)} \sum_{i=1}^{n} \zeta_q(\nu) \begin{pmatrix} 1 \\ f_n^{-1} D_i \end{pmatrix} F_i, \quad \text{for } \nu = \begin{pmatrix} c \\ d \end{pmatrix}.$$
Proof of Lemma A3.
The compactness of $[0, 1]$ implies
$$[0, 1] \subset \bigcup_{k=1}^{d_n} [q_k - l_n,\, q_k + l_n], \quad \text{for } q_k \in [0, 1].$$
Next, for all $q \in [0, 1]$, we put $k(q) = \arg\min_{k} |q - q_k|$, and we evaluate the studied term as a function of $\nu$ and $q$. We have
$$\begin{aligned} \sup_{\|\nu\| \le M} \sup_{q \in [0, 1]} \big\| \Pi_n(\nu, q) - \mathbb{E}[\Pi_n(\nu, q)] \big\| &\le \sup_{\|\nu\| \le M} \sup_{q \in [0, 1]} \big\| \Pi_n(\nu, q) - \Pi_n(\nu_{j(\nu)}, q) \big\| \\ &\quad + \sup_{\|\nu\| \le M} \sup_{q \in [0, 1]} \big\| \Pi_n(\nu_{j(\nu)}, q) - \Pi_n(\nu_{j(\nu)}, q_{k(q)}) \big\| \\ &\quad + \sup_{\|\nu\| \le M} \sup_{q \in [0, 1]} \big\| \Pi_n(\nu_{j(\nu)}, q_{k(q)}) - \mathbb{E}[\Pi_n(\nu_{j(\nu)}, q_{k(q)})] \big\| \\ &\quad + \sup_{\|\nu\| \le M} \sup_{q \in [0, 1]} \big\| \mathbb{E}[\Pi_n(\nu_{j(\nu)}, q_{k(q)})] - \mathbb{E}[\Pi_n(\nu, q_{k(q)})] \big\| \\ &\quad + \sup_{\|\nu\| \le M} \sup_{q \in [0, 1]} \big\| \mathbb{E}[\Pi_n(\nu, q_{k(q)})] - \mathbb{E}[\Pi_n(\nu, q)] \big\|. \end{aligned}$$
Firstly, we write
$$\sup_{\|\nu\| \le M} \sup_{q \in [0, 1]} \big\| \Pi_n(\nu, q) - \Pi_n(\nu_{j(\nu)}, q) \big\| \le \frac{1}{n\, \mathbb{E}[F_1]} \sum_{i} \Omega_i^0,$$
with
$$\Omega_i^0 = \sup_{\|\nu\| \le M} \sup_{q \in [0, 1]} \mathbb{1}_{\big\{ | R_i - \nu_{j(\nu)} - Q_R^{q}(F) | \le C l_n \big\}} F_i.$$
As
$$|\Omega_i^0| \le C, \qquad \mathbb{E}[\Omega_i^0] = O\big( l_n\, \alpha_F(f_n) \big) \quad \text{and} \quad \mathbb{E}[(\Omega_i^0)^2] = O\big( l_n\, \alpha_F(f_n) \big),$$
we get
$$\sup_{\|\nu\| \le M} \sup_{q \in [0, 1]} \big\| \Pi_n(\nu, q) - \Pi_n(\nu_{j(\nu)}, q) \big\| = O_{a.co.}\left( \left( \frac{\log n}{n\, \alpha_F(f_n)} \right)^{1/2} \right)$$
and
$$\sup_{\|\nu\| \le M} \sup_{q \in [0, 1]} \big\| \mathbb{E}\big[ \Pi_n(\nu, q) - \Pi_n(\nu_{j(\nu)}, q) \big] \big\| = o\left( \left( \frac{\log n}{n\, \alpha_F(f_n)} \right)^{1/2} \right).$$
Similarly,
$$\sup_{\|\nu\| \le M} \sup_{q \in [0, 1]} \big\| \Pi_n(\nu_{j(\nu)}, q) - \Pi_n(\nu_{j(\nu)}, q_{k(q)}) \big\| \le \frac{1}{n\, \mathbb{E}[F_1]} \sum_{i} \Omega_i^1,$$
with
$$\Omega_i^1 = \sup_{\|\nu\| \le M} \sup_{q \in [0, 1]} \Big( \mathbb{1}_{\big\{ | R_i - \nu_{j(\nu)} - Q_R^{q}(F) | \le C l_n \big\}} + \mathbb{1}_{\big\{ | R_i - Q_R^{q_{k(q)}}(F) | \le C l_n \big\}} \Big) F_i.$$
Since
$$|\Omega_i^1| \le C, \qquad \mathbb{E}[\Omega_i^1] = O\big( l_n\, \alpha_F(f_n) \big) \quad \text{and} \quad \mathbb{E}[(\Omega_i^1)^2] = O\big( l_n\, \alpha_F(f_n) \big),$$
we obtain
$$\sup_{\|\nu\| \le M} \sup_{q \in [0, 1]} \big\| \Pi_n(\nu_{j(\nu)}, q) - \Pi_n(\nu_{j(\nu)}, q_{k(q)}) \big\| = O_{a.co.}\left( \left( \frac{\log n}{n\, \alpha_F(f_n)} \right)^{1/2} \right)$$
and
$$\sup_{\|\nu\| \le M} \sup_{q \in [0, 1]} \big\| \mathbb{E}\big[ \Pi_n(\nu_{j(\nu)}, q) - \Pi_n(\nu_{j(\nu)}, q_{k(q)}) \big] \big\| = o\left( \left( \frac{\log n}{n\, \alpha_F(f_n)} \right)^{1/2} \right).$$
In the same manner, we write
$$\Pi_n(\nu_j, q_k) - \mathbb{E}[\Pi_n(\nu_j, q_k)] = \frac{1}{n\, \mathbb{E}[F_1]} \sum_{i=1}^{n} \Lambda_i,$$
where
$$\Lambda_i = \Big( \mathbb{1}_{\{R_i \le Q_R^{q_k}(F)\}} - \mathbb{1}_{\{R_i \le \nu_j + Q_R^{q_k}(F)\}} \Big) F_i - \mathbb{E}\Big[ \Big( \mathbb{1}_{\{R_i \le Q_R^{q_k}(F)\}} - \mathbb{1}_{\{R_i \le \nu_j + Q_R^{q_k}(F)\}} \Big) F_i \Big].$$
We then use the bounds
$$|\Lambda_i| \le C \quad \text{and} \quad \mathbb{E}\big[ \Lambda_i^2 \big] \le C\, \alpha_F(f_n),$$
together with a Bernstein-type inequality, to show that there exists $\eta > 0$ such that
$$\sum_{n} \mathbb{P}\left( \sup_{\|\nu\| \le M} \sup_{q \in [0, 1]} \big\| \Pi_n(\nu_{j(\nu)}, q_{k(q)}) - \mathbb{E}[\Pi_n(\nu_{j(\nu)}, q_{k(q)})] \big\| \ge \eta \left( \frac{\log n}{n\, \alpha_F(f_n)} \right)^{1/2} \right) \le \sum_{n} d_n^2 \max_{j} \max_{k} \mathbb{P}\left( \big\| \Pi_n(\nu_j, q_k) - \mathbb{E}[\Pi_n(\nu_j, q_k)] \big\| \ge \eta \left( \frac{\log n}{n\, \alpha_F(f_n)} \right)^{1/2} \right) < \infty.$$
Therefore,
$$\sup_{q \in [0, 1]} \sup_{\|\nu\| \le M} \big\| \Pi_n(\nu, q) - \Upsilon_n - \mathbb{E}\big[ \Pi_n(\nu, q) - \Upsilon_n \big] \big\| = O_{a.co.}\left( \left( \frac{\log n}{n\, \alpha_F(f_n)} \right)^{1/2} \right).$$
Observe that, by the same techniques, we can prove that
$$\sup_{q \in [0, 1]} \big\| \Upsilon_n - \mathbb{E}[\Upsilon_n] \big\| = O_{a.co.}\left( \left( \frac{\log n}{n\, \alpha_F(f_n)} \right)^{1/2} \right). \qquad \square$$

References

  1. Ramsay, J.O.; Silverman, B.W. Functional Data Analysis, 2nd ed.; Springer: New York, NY, USA, 2005. [Google Scholar]
  2. Ferraty, F.; Vieu, P. Nonparametric Functional Data Analysis. Theory and Practice; Springer Series in Statistics; Springer: New York, NY, USA, 2006. [Google Scholar]
  3. Xu, Y. Functional Data Analysis. In Springer Handbook of Engineering Statistics; Springer: London, UK, 2023; pp. 67–85. [Google Scholar]
  4. Ahmed, M.S.; Frévent, C.; Génin, M. Spatial Scan Statistics for Functional Data. In Handbook of Scan Statistics; Springer: New York, NY, USA, 2024; pp. 629–645. [Google Scholar]
  5. Collomb, G.; Härdle, W.; Hassani, S. A note on prediction via estimation of the conditional mode function. J. Stat. Plan. Inference 1986, 15, 227–236. [Google Scholar] [CrossRef]
  6. Quintela-Del-Rio, A.; Vieu, P. A nonparametric conditional mode estimate. J. Nonparametr. Stat. 1997, 8, 253–266. [Google Scholar] [CrossRef]
  7. Ioannides, D.; Matzner-Løber, E. A note on asymptotic normality of convergent estimates of the conditional mode with errors-in-variables. J. Nonparametr. Stat. 2004, 16, 515–524. [Google Scholar] [CrossRef]
  8. Louani, D.; Ould-Saïd, E.L.I.A.S. Asymptotic normality of kernel estimators of the conditional mode under strong mixing hypothesis. J. Nonparametr. Stat. 1999, 11, 413–442. [Google Scholar] [CrossRef]
  9. Bouzebda, S.; Khardani, S.; Slaoui, Y. Asymptotic normality of the regression mode in the nonparametric random design model for censored data. Commun. Stat. Theory Methods 2023, 52, 7069–7093. [Google Scholar] [CrossRef]
  10. Bouzebda, S.; Didi, S. Some results about kernel estimators for function derivatives based on stationary and ergodic continuous time processes with applications. Commun. Stat. Theory Methods 2022, 51, 3886–3933. [Google Scholar] [CrossRef]
  11. Ferraty, F.; Laksaci, A.; Vieu, P. Estimating some characteristics of the conditional distribution in nonparametric functional models. Stat. Inference Stoch. Process. 2006, 9, 47–76. [Google Scholar] [CrossRef]
  12. Bouzebda, S.; Chaouch, M.; Laïb, N. Limiting law results for a class of conditional mode estimates for functional stationary ergodic data. Math. Methods Stat. 2016, 25, 1066–5307. [Google Scholar] [CrossRef]
  13. Ezzahrioui, M.H.; Ould-Saïd, E. Asymptotic normality of a nonparametric estimator of the conditional mode function for functional data. J. Nonparametr. Stat. 2008, 20, 3–18. [Google Scholar] [CrossRef]
  14. Ezzahrioui, M.H.; Saïd, E.O. Some asymptotic results of a non-parametric conditional mode estimator for functional time-series data. Stat. Neerl. 2010, 64, 171–201. [Google Scholar] [CrossRef]
  15. Dabo-Niang, S.; Kaid, Z.; Laksaci, A. Asymptotic properties of the kernel estimate of spatial conditional mode when the regressor is functional. AStA Adv. Stat. Anal. 2015, 99, 131–160. [Google Scholar] [CrossRef]
  16. Bouanani, O.; Laksaci, A.; Rachdi, M.; Rahmani, S. Asymptotic normality of some conditional nonparametric functional parameters in high-dimensional statistics. Behaviormetrika 2019, 46, 199–233. [Google Scholar] [CrossRef]
  17. Azzi, A.; Belguerna, A.; Laksaci, A.; Rachdi, M. The scalar-on-function modal regression for functional time series data. J. Nonparametr. Stat. 2024, 36, 503–526. [Google Scholar] [CrossRef]
  18. Stone, C.J. Consistent nonparametric regression. Ann. Stat. 1977, 5, 595–620. [Google Scholar] [CrossRef]
  19. Koenker, R.; Zhao, Q. Conditional quantile estimation and inference for ARCH models. Econom. Theory 1996, 12, 793–813. [Google Scholar] [CrossRef]
  20. Hallin, M.; Lu, Z.; Yu, K. Local linear spatial quantile regression. Bernoulli 2009, 15, 659–686. [Google Scholar] [CrossRef]
  21. Cardot, H.; Crambes, C.; Sarda, P. Quantile regression when the covariates are functions. J. Nonparametr. Stat. 2005, 17, 841–856. [Google Scholar] [CrossRef]
  22. Wang, H.; Ma, Y. Optimal subsampling for quantile regression in big data. Biometrika 2021, 108, 99–112. [Google Scholar] [CrossRef]
  23. Jiang, Z.; Huang, Z. Single-index partially functional linear quantile regression. Commun. Stat.-Theory Methods 2024, 53, 1838–1850. [Google Scholar] [CrossRef]
  24. Dabo-Niang, S.; Kaid, Z.; Laksaci, A. Spatial conditional quantile regression: Weak consistency of a kernel estimate. Rev. Roum. Math. Pures Appl. 2012, 57, 311–339. [Google Scholar]
  25. Chowdhury, J.; Chaudhuri, P. Nonparametric depth and quantile regression for functional data. Bernoulli 2019, 25, 395–423. [Google Scholar] [CrossRef]
  26. Mutis, M.; Beyaztas, U.; Karaman, F.; Shang, H.L. On function-on-function linear quantile regression. J. Appl. Stat. 2025, 52, 814–840. [Google Scholar] [CrossRef] [PubMed]
  27. Fan, J. Local Polynomial Modelling and Its Applications: Monographs on Statistics and Applied Probability 66; Routledge: Abingdon-on-Thames, UK, 2018. [Google Scholar]
  28. Rachdi, M.; Laksaci, A.; Demongeot, J.; Abdali, A.; Madani, F. Theoretical and practical aspects of the quadratic error in the local linear estimation of the conditional density for functional data. Comput. Stat. Data Anal. 2014, 73, 53–68. [Google Scholar] [CrossRef]
  29. Baíllo, A.; Grané, A. Local linear regression for functional predictor and scalar response. J. Multivar. Anal. 2009, 100, 102–111. [Google Scholar] [CrossRef]
  30. Barrientos-Marin, J.; Ferraty, F.; Vieu, P. Locally modelled regression and functional data. J. Nonparametr. Stat. 2010, 22, 617–632. [Google Scholar] [CrossRef]
  31. Berlinet, A.; Elamine, A.; Mas, A. Local linear regression for functional data. Ann. Inst. Stat. Math. 2011, 63, 1047–1075. [Google Scholar] [CrossRef]
  32. Demongeot, J.; Laksaci, A.; Madani, F.; Rachdi, M. Functional data: Local linear estimation of the conditional density and its application. Statistics 2013, 47, 26–44. [Google Scholar] [CrossRef]
  33. Messaci, F.; Nemouchi, N.; Ouassou, I.; Rachdi, M. Local polynomial modelling of the conditional quantile for functional data. Stat. Methods Appl. 2015, 24, 597–622. [Google Scholar] [CrossRef]
  34. Laksaci, A.; Ould Saïd, E.; Rachdi, M. Uniform consistency in number of neighbors of the k NN estimator of the conditional quantile model. Metrika 2021, 84, 895–911. [Google Scholar] [CrossRef]
  35. Ota, H.; Kato, K.; Hara, S. Quantile regression approach to conditional mode estimation. Electron. J. Stat. 2019, 13, 3120–3160. [Google Scholar] [CrossRef]
  36. Almulhim, F.A.; Alamari, M.B.; Bouzebda, S.; Kaid, Z.; Laksaci, A. Robust Estimation of $L^1$-Modal Regression Under Functional Single-Index Models for Practical Applications. Mathematics 2025, 13, 602. [Google Scholar] [CrossRef]
  37. Belarbi, F.; Chemikh, S.; Laksaci, A. Local linear estimate of the nonparametric robust regression in functional data. Stat. Probab. Lett. 2018, 134, 128–133. [Google Scholar] [CrossRef]
  38. Iglesias Pérez, M.D.C. Estimación de la función de distribución condicional en presencia de censura y truncamiento: Una aplicación al estudio de la mortalidad en pacientes diabéticos. Estad. Esp. 2003, 45, 275–302. [Google Scholar]
  39. Yuan, M. GACV for quantile smoothing splines. Comput. Stat. Data Anal. 2006, 50, 813–829. [Google Scholar] [CrossRef]
  40. Panchbhai, K.G.; Lanjewar, M.G. Portable system for cocoa bean quality assessment using multi-output learning and augmentation. Food Control 2025, 174, 111234. [Google Scholar] [CrossRef]
Figure 1. Smooth curves $F_i(t)$ for 100 equispaced measurements $t$ in $(0, 1)$, as displayed on the X-axis.
Figure 2. Rough curves $F_i(t)$ for 100 equispaced measurements $t$ in $(0, 1)$, as displayed on the X-axis.
Figure 3. NIR curves.
Figure 4. Moisture prediction.
Figure 5. Fat prediction.
Table 1. MSE for different estimators in different scenarios.

Cond. Dist. | Model      | Curves | $\widehat{M}_R$ | $\overline{M}_R$ | $\widetilde{M}_R$
Normal      | Hom. model | Smooth | 0.0258          | 0.0897           | 0.1672
Normal      | Hom. model | Rough  | 0.1067          | 0.4676           | 0.5679
Normal      | Het. model | Smooth | 0.2672          | 0.7426           | 1.0578
Normal      | Het. model | Rough  | 0.3617          | 1.2725           | 1.8811
Laplace     | Hom. model | Smooth | 0.4257          | 0.6150           | 0.6317
Laplace     | Hom. model | Rough  | 0.9804          | 1.0922           | 1.1788
Laplace     | Het. model | Smooth | 0.8967          | 1.6824           | 1.7088
Laplace     | Het. model | Rough  | 0.9176          | 2.4521           | 2.6588
Weibull     | Hom. model | Smooth | 0.5179          | 1.5005           | 1.5446
Weibull     | Hom. model | Rough  | 0.8399          | 2.4873           | 2.7098
Weibull     | Het. model | Smooth | 0.7840          | 1.6253           | 1.4102
Weibull     | Het. model | Rough  | 0.9705          | 3.3567           | 4.3456