Kolmogorov Entropy for Convergence Rate in Incomplete Functional Time Series: Application to Percentile and Cumulative Estimation in High Dimensional Data

Establishing convergence rates for distribution-free functional data analyses is challenging: it requires advanced tools from functional analysis in pure mathematics. This paper aims to bring several contributions to the existing functional data analysis literature. First, we prove that Kolmogorov entropy is a fundamental tool for characterizing the convergence rate of local linear estimation. Precisely, we use this tool to derive the uniform convergence rate of the local linear estimators of the conditional cumulative distribution function and of the conditional quantile function. Second, a central limit theorem for the proposed estimators is established. These results are proved under general assumptions that cover the incomplete functional time series case. Specifically, we model the correlation using the ergodic assumption and assume that the response variable is missing at random. Finally, we conduct Monte Carlo simulations to assess the finite sample performance of the proposed estimators.


Introduction
Statistical problems associated with the study of functional random variables, that is, variables with values in an infinite-dimensional space, have garnered increasing attention in the statistical literature over the past few decades. The abundance of data measured on increasingly fine temporal/spatial grids, as is the case in meteorology, medicine, satellite imagery, and numerous other research fields, has inspired the development of this research theme. Thus, the statistical modeling of these data as random functions has given rise to a number of difficult theoretical and numerical research questions; we may refer to [1][2][3][4][5][6] for parametric and nonparametric models. For the latest contributions in FDA and its related topics, one can refer to [7][8][9][10][11][12][13][14].
Quantile regression has emerged as a significant statistical technique for data analysis since Koenker and Bassett's [15] seminal work. However, concerns about quantile crossing and model misspecification [16][17][18] have led to the development of nonparametric estimation of conditional quantile functions [19,20]. This estimation originates from [21], who proved convergence in probability of the empirical conditional distribution function; refer to [22,23]. This technique is an alternative to mean regression, and it possesses many desirable properties: it is more efficient than mean regression when the data follow a heavy-tailed distribution, and it is frequently used to characterize the entire conditional distribution of the response. Conditional percentile estimation has been studied extensively by several researchers [24][25][26], most of whom used the kernel method approach. For instance, Ferraty et al. [4] studied the uniform convergence of such an estimator in the independent and identically distributed case, while, in the same setting, Samanta [27] obtained the almost complete (a.co.) convergence of the conditional percentile estimator.
Studies of the local linear estimation of the conditional quantile function (LLECQF) are still limited. For example, Messaci et al. [28] studied the local linear estimation (LLE) of the conditional quantile function (CQF) by inverting the estimator presented in [29]. Al-Awadhi et al. [30] proved the a.co. convergence and the asymptotic law of the CQF by considering an estimator based on the L 1 approach. Beyond its importance in functional data analysis, the LLE has various merits over the kernel technique; in general, it reduces the bias of the kernel approach [31,32]. Furthermore, the LLE was introduced in the FDA only recently by [33], who concentrated on the LLE of the curve regression when the explanatory variable is Hilbert-valued. Barrientos-Marin et al. [34] proposed the LLE of the nonparametric regression operator and studied its asymptotic properties; in particular, this operator can be applied to functional covariates.
In this article, we are interested in studying the local linear estimation of the conditional cumulative distribution function (LLECCDF) and LLECQF under the assumption of ergodicity. We are interested in the uniform a.co. convergence of the constructed sequences using the Kolmogorov entropy function. In addition to Kolmogorov entropy, other information measures such as Kullback-Leibler divergence have been considered for a convergence rate study of estimators in multivariate time series modelling; refer to [35,36].
From a practical point of view, the ergodic framework is an essential condition in statistical physics, number theory, Markov chains, and other fields. The concept of ergodicity is fundamental in the research of stochastic processes. Note also that one of the arguments invoked by [37] motivating the introduction of the concept of ergodicity is that, for certain classes of processes, it can be much easier to prove ergodic properties rather than the mixing condition. Hence, the ergodicity hypothesis seems to be the most naturally adapted and provides a better framework to study data series such as those generated by noisy chaos.
In their discussion, [38] provided an example of processes that are ergodic but not mixing, which may be summarized as follows: let {(T i , λ i ) : i ∈ Z} be a strictly stationary process such that, conditionally on the σ-field T i−1 generated by (T i−1 , λ i−1 , T i−2 , λ i−2 , . . .), T i is Poisson-distributed with parameter λ i . Assume that λ i = f (λ i−1 , T i−1 ), where f : [0, ∞) × N → (0, ∞) is a given function. This process is not mixing in general (see Remark 3 of [39]). It is known that any sequence (ε i ) i∈Z of i.i.d. random variables is ergodic. Hence, it is immediately clear that (Y i ) i∈Z with Y i = ϑ((. . . , ε i−1 , ε i ), (ε i+1 , ε i+2 , . . .)), for some Borel-measurable function ϑ(·), is also ergodic; see Proposition 2.10 on page 54 in [40]. Under the condition of ergodicity, [41] studied the conditional quantile for ergodic data by considering an iterative model, whereas the authors in [42] considered the nonparametric estimation of quantiles for censored data.
All these studies were concerned with the complete-data situation. This work investigates the situation when data are missing at random (MAR); for instance, see [43]. In contemporary statistics, missing data are pervasive, posing a significant obstacle for various applications. Missing data occur when the value of a variable is not recorded for an observation. For example, missingness may occur when our data are compiled from sources that have measured different variables; for instance, in the healthcare industry, the data routinely collected on patients may vary between clinics and hospitals. Among numerous other reasons are sensor failure, data censorship, privacy concerns, pharmaceutical tracing tests, and reliability tests. It can occur in any experimental setting where contamination of the treatment or subject mortality is possible. This topic has been extensively examined in numerous statistical problems; see [43,44] for a thorough overview. In the present work, we only observe Y in cases where some indicator B equals one, and the indicator B is conditionally independent of Y given X. This assumption is useful when information in the form of covariate data is available to explain the missingness; refer to [45][46][47]. The first studies on MAR are presented by [48], who established an approximation of the regression operator and studied its asymptotic consistency when the curve regressor is observed and the response of interest is missing at random. In the ergodic data case, Ling et al. [49] studied the asymptotic distribution of the estimator proposed in [48]; see also [50,51].
The main objective of this paper is to evaluate the convergence rate of some functional estimators through the Kolmogorov entropy function. More precisely, we focus on the local linear smoothing of the distribution function and of its inverse, the quantile function. The asymptotic properties of the constructed estimators are evaluated when the data are correlated as functional ergodic time series data and the response variable is observed under the MAR structure. The efficiency of these estimators is uniformly quantified using the entropy metric, allowing us to assess the impact of the functional path of the data. In particular, the Kolmogorov entropy gives a trade-off between the data's sparsity and the approximation's efficiency. Moreover, the Kolmogorov entropy explores the topological structure of the functional space and its spectral properties. Thus, stating the uniform consistency in terms of the Kolmogorov entropy is more beneficial than the classical pointwise case. Although the uniform convergence of the LLECCDF and LLECQF in the functional ergodic time series (FETS) structure is mathematically challenging, the obtained results are also pivotal for many applied issues, such as the choice of the smoothing parameter, bootstrapping, and single index modeling. The second challenging issue of this contribution concerns the MAR feature of the response variable. Our results can be used to determine an estimator of the unconditional distribution of the scalar response, even if it is not completely observed. All these challenging issues will be discussed using specific examples in Section 6. In addition to the uniform consistency, we prove the asymptotic normality of the LLECCDF, which is important to provide a confidence interval comparable with the predictive interval deduced from the LLECQF. Once again, this prediction using the predictive subset is also crucial in the context of incomplete functional time series data.
Finally, we point out that, to our best knowledge, this problem of uniform consistency of local linear approximation under MAR and FETS structures was open up to the present, giving the main motivation to our paper.
The rest of the paper is organized as follows. In Section 2, we state the formal setup and define the estimators. More precisely, Section 2.1 is devoted to the LLECCDF estimator, while Section 2.2 introduces the LLECQF estimator. The convergence rate of the two approximation sequences is established in Section 3. In Section 4, we derive the limiting distribution of the proposed estimators. In Section 5, we conduct Monte Carlo simulations to assess the finite sample performance of the proposed estimators. Section 6 is devoted to highlighting the principal features of our contribution. In Section 7, we give some concluding remarks. All the proofs are gathered in the last section.

LLECCDF: Numerical Approximation of CCDF-Model
Let {(X i , Y i ) : 1 ≤ i ≤ n} be a sequence of stationary ergodic functional random variables identically distributed as (X, Y), where X takes values in some abstract semi-metric space F equipped with a semi-metric d(·, ·) and Y takes values in R. For the reader's convenience, we recall the ergodic property of processes and its link with mixing. Let {X n , n ∈ Z} be a stationary sequence. Consider the backward field A n = σ(X k : k ≤ n) and the forward field B m = σ(X k : k ≥ m). The sequence is strongly mixing if sup{|P(A ∩ B) − P(A)P(B)| : A ∈ A 0 , B ∈ B n } → 0 as n → ∞. The sequence is ergodic if, for any two measurable sets A, B, lim n→∞ (1/n) ∑ k=1..n P(A ∩ τ k B) = P(A)P(B), where τ is the time evolution or shift transformation taking X k into X k+1 . We shall also use the same symbol τ to denote the induced set transformation, which takes, for example, sets B ∈ B m into sets τB ∈ B m+1 ; for instance, see [52]. The strong mixing in the above definition is more stringent than what is ordinarily referred to (in the vocabulary of measure-preserving dynamical systems) as strong mixing, namely lim n→∞ P(A ∩ τ −n B) = P(A)P(B) for any two measurable sets A, B; see, for instance, [52,53] and the more recent references [54][55][56][57][58][59][60]. Hence, strong mixing implies ergodicity, whereas the converse is not always true (see, e.g., Remark 2.6 on page 50 in connection with Proposition 2.8 on page 51 in [40]). For every x ∈ F , the conditional distribution function CDF(y | x) of Y given X = x is defined by CDF(y | x) = P(Y ≤ y | X = x), y ∈ R. The LLECCDF is obtained by assuming, for every z in the vicinity of x, a local linear expansion of CDF(y | z), where α(·, ·) is a bilinear locating function and δ(·, ·) is a bilinear function such that |δ(·, ·)| = d(·, ·). In the rest of the paper, we suppose that the CDF is of C 1 -class with respect to y and that its derivative is the conditional density function, denoted cdf(·).
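To fix ideas, the local linear smoothing principle invoked above can be written as follows. This is a sketch in the spirit of the standard functional local linear literature (e.g., Barrientos-Marin et al. [34]), not necessarily the authors' exact display:

```latex
% Conditional cumulative distribution function (CCDF):
\mathrm{CDF}(y \mid x) \;=\; \mathbb{P}\!\left( Y \le y \mid X = x \right),
\qquad y \in \mathbb{R}.
% Local linear approximation: for every z in the vicinity of x,
\mathrm{CDF}(y \mid z) \;=\; \beta_{0}(x,y) \;+\; \beta_{1}(x,y)\,\alpha(z,x)
\;+\; o\!\left( d(z,x) \right),
% where the locating function vanishes on the diagonal and is
% comparable to the semi-metric:
\alpha(x,x) = 0, \qquad
C_{1}\, d(z,x) \;\le\; \left| \alpha(z,x) \right| \;\le\; C_{2}\, d(z,x).
```

Here β 0 (x, y) plays the role of CDF(y | x) itself, which is why the intercept of the local fit is retained as the estimator.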
In the case of a missing response, one observes an incomplete sample of size n from (X, Y, B), usually denoted {(X i , Y i , B i ) : 1 ≤ i ≤ n}, where the Bernoulli random variable B i equals 1 if Y i is observed and 0 otherwise. The Bernoulli variable B is supposed to be such that P(B = 1 | X = x, Y = y) = P(B = 1 | X = x) =: P(x) is a continuous function. Under this smoothing consideration, we define the LLECCDF of CDF(· | ·) by finding the minimizers (β 1 , β 2 ) of a weighted least-squares criterion, where λ K := λ K,n and λ J := λ J,n are bandwidth parameters and J (·) and Ker(·) are, respectively, distribution and kernel functions. The explicit solution to this minimization yields the estimator CDF(· | ·). The first main contribution of this work is a precise convergence rate of the approximation CDF(· | ·) uniformly over a not necessarily compact subset C F of F . Definition 1. Let C F be a subset of a semi-metric space F , and let ε > 0 be given. A finite set of points x 1 , . . . , x N in F is called an ε-net of C F if C F ⊂ ∪ k=1..N Ba(x k , ε).
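The minimization defining the LLECCDF can be sketched as follows, assuming the usual weighted least-squares form of the functional local linear literature adapted to the MAR setting (the indicator B i simply discards the missing responses); this is a plausible reconstruction, not the authors' exact display:

```latex
(\widehat{\beta}_{0}, \widehat{\beta}_{1})
\;=\; \arg\min_{(\beta_{0},\, \beta_{1})}\,
\sum_{i=1}^{n} B_{i}
\left( J\!\left( \frac{y - Y_{i}}{\lambda_{J}} \right)
       - \beta_{0} - \beta_{1}\, \alpha(X_{i}, x) \right)^{2}
\mathrm{Ker}\!\left( \frac{\delta(x, X_{i})}{\lambda_{K}} \right),
\qquad
\widehat{\mathrm{CDF}}(y \mid x) \;=\; \widehat{\beta}_{0}.
```

The kernel weight localizes around the curve x through δ(·, ·), while the smoothed indicator J localizes around the point y.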
The quantity ψ C F (ε) = log(N ε (C F )), where N ε (C F ) =: d n is the minimal number of open balls in F of radius ε which is necessary to cover C F , is called Kolmogorov's ε-entropy of the set C F .
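As a toy illustration of Definition 1, the ε-entropy can be computed in closed form when C F is a real interval and the balls are sub-intervals of radius ε. The helper below is illustrative only and is not part of the paper's methodology:

```python
import math

def epsilon_entropy_interval(length: float, eps: float) -> float:
    """Kolmogorov eps-entropy of a real interval of the given length:
    the minimal number of open balls (sub-intervals of radius eps)
    needed to cover it is ceil(length / (2 * eps)), and the entropy
    is the logarithm of that covering number."""
    n_eps = math.ceil(length / (2 * eps))
    return math.log(n_eps)

# In the finite-dimensional case the entropy grows like log(1/eps),
# in line with the discussion of Section "Discussion and Comments".
for eps in (0.1, 0.01, 0.001):
    print(eps, epsilon_entropy_interval(1.0, eps))
```

The logarithmic growth in 1/ε seen here is precisely what breaks down in infinite dimension, where the entropy can grow polynomially in 1/ε.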
This concept was introduced by Kolmogorov in the mid-1950s (refer to [61]). It serves as a measure of the complexity of a set, indicating that high entropy implies that a significant amount of information is required to accurately describe an element within a given tolerance, ε. Consequently, the selection of the topological structure (specifically, the choice of semi-metric) plays a crucial role when examining asymptotic results that are uniform over a subset, C F , of F . In particular, we subsequently observe that a well-chosen semi-metric can enhance the concentration of the probability measure for the functional variable, X, while minimizing the ε-entropy of the subset, C F . Ferraty and Vieu [6] emphasized the phenomenon of concentration of the probability measure for the functional variable by calculating small ball probabilities in different standard scenarios; refer to [62]. For readers interested in these concepts (entropy and small ball probabilities) and/or the utilization of Kolmogorov's ε-entropy in dimensionality reduction problems, we recommend referring to [63] or/and [64], respectively. Let (u n ), n ∈ N, be a sequence of real random variables. We say that (u n ) converges almost-completely (a.co.) toward zero if, and only if, for all ε > 0, ∑ n≥1 P(|u n | > ε) < ∞. Moreover, we say that the rate of the almost-complete convergence of (u n ) toward zero is of order v n (with v n → 0), and we write u n = O a.co. (v n ), if there exists ε 0 > 0 such that ∑ n≥1 P(|u n | > ε 0 v n ) < ∞. This kind of convergence implies both almost-sure convergence and convergence in probability.

LLECQF: Numerical Approximation of CQF-Model
The second approximation concerns the CQF-Model of order p, for p ∈ (0, 1), denoted CQF p (x). The natural estimator is obtained by inverting the LLECCDF. However, unlike the local constant estimator, the LLECCDF is not necessarily monotone, hence not invertible. To overcome this issue, we use the robust definition of CQF p (x) as the minimizer of the conditional expected loss built on the scoring function L p (y) = y(p − 1 1 [y<0] ), with 1 1 S the indicator of S. Once again, the LLECQF is obtained via a smoothing approximation of the model locally in a neighborhood of the location x: we suppose that CQF p (x) admits a local linear expansion, the coefficients β 1 and β 2 are estimated by minimizing the resulting localized loss, and the LLECQF of CQF p (x) is deduced from the estimated coefficients. Once again, our main focus is to establish the convergence rate using the Kolmogorov entropy and the following uniform-Lipschitzian condition: for all (y 1 , y 2 ) ∈ N 2 y and x 1 , x 2 ∈ C F , |CDF (j) (y 1 | x 1 ) − CDF (j) (y 2 | x 2 )| ≤ C 2 (d k 1 (x 1 , x 2 ) + |y 1 − y 2 | k 2 ), for j = 0, 1 and C 2 , k 1 , k 2 > 0.
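The two characterizations of the conditional quantile discussed above (generalized inverse and M-estimation form) can be sketched as follows, following the standard definitions in the quantile regression literature:

```latex
% Conditional quantile of order p as a generalized inverse:
\mathrm{CQF}_{p}(x) \;=\; \inf\{\, y \in \mathbb{R} \;:\; \mathrm{CDF}(y \mid x) \ge p \,\}.
% Robust (M-estimation) characterization used for the LLE,
% built on the check (scoring) function L_p:
\mathrm{CQF}_{p}(x) \;=\; \arg\min_{t \in \mathbb{R}}\,
\mathbb{E}\!\left[\, L_{p}(Y - t) \mid X = x \,\right],
\qquad
L_{p}(y) \;=\; y\left( p - \mathbb{1}_{[y < 0]} \right).
```

The M-estimation form avoids inverting the (possibly non-monotone) LLECCDF, which is precisely the motivation given in the text.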

Uniform Convergence Rate
Let F i and G i (i = 1, . . . , n) be the σ-algebras generated by ((X 1 , Y 1 ), . . . , (X i , Y i )) and ((X 1 , Y 1 ), . . . , (X i , Y i ), X i+1 ), respectively. The function ψ x (·, ·) plays a role similar to that of the small ball probability in [34]. Finally, let C and C be strictly positive generic constants. To obtain the convergence rate in this functional ergodic MAR scheme, we consider the following assumptions: (H1) The first-order derivative of ψ(·) exists and is bounded in C F . (H2) There exist a non-random function ψ i (·) and a non-random function Ψ(·) satisfying, ∀t ∈ [−1, 1], the stated limit conditions. (H3) The distribution function J (·) and the kernel Ker(·) fulfill the stated regularity conditions; in particular, the function J (·) has a bounded derivative. (H4) The real sequence d n associated with r n = O(ln n/n) satisfies the stated growth conditions. (H5) The bandwidth λ K is linked to α(·) and ψ(·), where P is the law of X.
We first recall that our assumptions are not restrictive and may be considered as standard in the functional local linear analysis context. They are similar to those used in the local linear estimation of the quantile regression in [30]. In particular, Assumption (H1) concerns the usual concentration property of the functional variable. It is well documented that this property allows us to explore the functional nature of the data. Assumption (H3)(iii) is a Markov-type condition and characterizes the conditional moments. It is satisfied when considering, for instance, the regression model Y i = m(X i ) + ε i , where (ε i ) is a square-integrable process independent of (X i ). Finally, Assumptions (H3)(i)-(ii), (H4), and (H5) are technical conditions, similar to those used by [30].
The following theorem gives the uniform, almost complete convergence of CDF(y | x) with the rate.
The proof of Theorem 1 is postponed to the last section. The following theorem gives the uniform almost complete convergence, with rate, of CQF p (x).

Asymptotic Normality
The second asymptotic result concerns the asymptotic law of the sequence CDF(y | x). To this end, we enhance Assumption (H5) by assuming that the kernel Ker(·) satisfies (H4) and has a first derivative Ker (·) satisfying an additional regularity condition. Let us now state the following theorem, which gives the central limit theorem for the estimator CDF(y | x). Below, we write Z =D N (µ, σ 2 ) whenever the random variable Z follows a normal law with expectation µ and variance σ 2 , and →D denotes convergence in distribution.

Theorem 3. Under the assumptions of Theorem 1, and if the smoothing parameters satisfy suitable additional conditions, the suitably normalized difference between the estimator CDF(y | x) and the true CDF(y | x) converges in distribution to a centered normal law with variance V JK (y | x).
The proof of Theorem 3 is postponed to the last section. Clearly, this asymptotic result has many applications in practice. In particular, it can be used to build a confidence interval for the true value of CDF(y | x). The latter is obtained by estimating the asymptotic variance V JK (y | x) using a plug-in approach, in which M 1 and M 2 are replaced by their empirical counterparts. The resulting (1 − ζ) confidence interval for CDF(y | x) is centered at the estimator, with half-width equal to t 1−ζ/2 times the estimated standard error, where t 1−ζ/2 denotes the quantile of order 1 − ζ/2 of the standard normal distribution. We note that the function ψ(·) does not appear in the computation of the confidence interval since it cancels out.
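The plug-in construction above can be sketched numerically as follows. This is a minimal sketch, not the paper's exact normalization: `n_eff` stands in for the effective-sample factor (which in the paper involves the bandwidths, with ψ(·) cancelling out), and `var_hat` for the plug-in estimate of V JK (y | x):

```python
from statistics import NormalDist

def plug_in_ci(cdf_hat: float, var_hat: float, n_eff: float, zeta: float = 0.05):
    """Plug-in (1 - zeta) confidence interval for CDF(y | x).

    cdf_hat : point estimate of the conditional CDF at (y, x)
    var_hat : plug-in estimate of the asymptotic variance
    n_eff   : hypothetical effective-sample normalization (assumption)
    """
    z = NormalDist().inv_cdf(1 - zeta / 2)      # standard normal quantile
    half_width = z * (var_hat / n_eff) ** 0.5
    # A distribution function lives in [0, 1], so clip the interval.
    return (max(0.0, cdf_hat - half_width), min(1.0, cdf_hat + half_width))

lo, hi = plug_in_ci(cdf_hat=0.6, var_hat=0.24, n_eff=200)
print(round(lo, 3), round(hi, 3))
```

Only the quantile t 1−ζ/2 and the estimated standard error enter the interval, mirroring the remark that ψ(·) is not needed.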

Numerical Results
In this computational study, we illustrate the three fundamental axes of our topic: the functional structure, the asymptotic normality, and the local linear smoothing. In the first illustration, we examine the impact of the functional structure on the convergence rate of the constructed estimators. To cover the general features of our study, namely ergodicity and the MAR mechanism, we generate an artificial functional time series using Hilbertian autoregressive processes. The linearity of this kind of process allows us to incorporate our theoretical assumption concerning the ergodicity structure. To do this, we employ the routine fts.rar from the R package freqdom.fda. The latter has a nice feature based on dynamic functional principal component analysis (DFPCA); for instance, see [65]. In this empirical study, we use DFPCA to generate the ergodic functional data using specific basis functions. Specifically, we have used the Fourier basis functions (FBF) to obtain the functional (X i ). Formally, X i = Υ(X i−1 ) + ε i , where Υ is an operator with kernel ψ(·, ·) and ε i is a white noise. In the cited routine, Υ is constructed from the FBF {u j : j = 1, . . . , d} by taking (ψ ij ) ij = (⟨Υ(u i ), u j ⟩) ij as the corresponding matrix. For practical purposes, the argument op.norms controls the degree of dependency in these functional observations. The observed functional regressors are plotted in Figure 1.
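The generation mechanism can be sketched in a stand-alone way as follows. This is a hypothetical analogue of the FAR(1) construction, not the freqdom.fda code itself: the operator is represented by a d x d matrix on a Fourier basis and rescaled so that its spectral norm equals `op_norm` (mirroring op.norms; a value below 1 keeps the process stationary and ergodic):

```python
import numpy as np

rng = np.random.default_rng(42)

def simulate_far1(n=200, d=11, t_grid=101, op_norm=0.5):
    """Sketch of a functional AR(1) generator: X_i = Upsilon(X_{i-1}) + eps_i,
    with Upsilon acting on d Fourier-basis scores."""
    Psi = rng.normal(size=(d, d))
    Psi *= op_norm / np.linalg.norm(Psi, 2)      # fix the spectral norm
    t = np.linspace(0.0, 1.0, t_grid)
    # Fourier basis evaluated on the grid: 1, sin(2*pi*t), cos(2*pi*t), ...
    basis = np.vstack([np.ones_like(t)] +
                      [f(2 * np.pi * ((j + 2) // 2) * t)
                       for j, f in zip(range(d - 1), [np.sin, np.cos] * d)])
    scores = np.zeros((n, d))
    for i in range(1, n):
        scores[i] = Psi @ scores[i - 1] + rng.normal(size=d)
    return scores @ basis                        # curves X_i(t), shape (n, t_grid)

X = simulate_far1()
print(X.shape)
```

Varying `op_norm` then plays the role of op.norms in the tables: larger values give more strongly dependent curves.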
Specifically, such a sample was obtained by taking arbitrary values of op.norms, allowing us to cover various degrees of ergodicity. Secondly, the variable of interest is generated using a regression equation whose error term is drawn from N(0, 0.5). With this consideration, the conditional law of Y given X is a normal distribution with mean equal to the regression operator applied to X. The missing mechanism is controlled using the conditional probability of observation, where the scalar γ controls the missing rate. Once again, we simulate with several values of γ to evaluate the influence of this characteristic on the estimators. The sequences LLECCDF and LLECQF are computed using the (−1, 1)-quadratic kernel and local cross-validation on the number of neighbors, based on the mean square error (MSE) of the leave-one-out-curve estimation of the conditional median. Clearly, the Kolmogorov entropy is an important factor of the topological structure; the latter is controlled through the locating functions α and δ. For the sake of shortness, we take α = δ equal to the L 2 -distance between the qth derivatives of the curves computed on the basis functions, as well as the PCA semi-metric associated with the eigenfunctions of the empirical covariance operator; exactly, we compute this metric using the first m eigenfunctions. The MSE was evaluated over various values of q and m. In Tables 1 and 2, we summarize the MSE of both estimators for various values of the mentioned parameters: the level of dependency (op.norms), the missing rate γ, and the metric parameter q or m.
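A MAR mechanism of the kind described above can be sketched as follows. The logistic link and the curve summary `X.mean(axis=1)` are hypothetical illustration choices, not the paper's specification; what matters is that the observation probability depends on the covariate only, with γ tuning the overall missing rate:

```python
import numpy as np

rng = np.random.default_rng(0)

def mar_indicator(X, gamma):
    """Draw MAR observation indicators B_i with
    P(B = 1 | X = x) = logistic(gamma + mean level of the curve x),
    a continuous function of x only (the MAR assumption)."""
    lin = gamma + X.mean(axis=1)                 # hypothetical curve summary
    p_obs = 1.0 / (1.0 + np.exp(-lin))           # continuous P(x) in (0, 1)
    return rng.binomial(1, p_obs)                # B_i = 1 iff Y_i is observed

X = rng.normal(size=(500, 100))                  # 500 discretized curves
B = mar_indicator(X, gamma=1.0)
print("observed rate:", B.mean())
```

Decreasing γ lowers the observation probability uniformly, reproducing the higher missing rates explored in the tables.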
It is clear that the behavior of the estimator is strongly impacted by the different parameters of the estimation procedure, including the dependency level, the missing rate, and the topological structure. Unsurprisingly, the topological structure has an important role: the variability of the MSE as a function of the metric type is more important than that of the other parameters. All in all, the effectiveness of the estimator is also affected by the quantities op.norms, γ, and q or m. In conclusion, we can say that the computational study confirms the theoretical statement that the convergence rate is strongly affected by the topological structure of the functional regressor. The second illustration concerns the quality of the asymptotic normality result in Theorem 3. Specifically, we aim to examine the behavior of the asymptotic distribution with respect to the degree of correlation, as well as the missing rate. For this purpose, we repeat the previous sampling process independently m times, and each time we calculate the standardized quantity (9). Recall that the true conditional law of Y given X is known, justifying the use of the true conditional cumulative distribution function CDF(y | x). Observe also that the estimation of the function ψ(λ K ) is not necessary: it cancels out through the definitions of M 1 and M 2 (see Section 4). Now, the m-sample of the quantity (9) is calculated using the ideas of the first illustration concerning the construction of CDF(y | x). Moreover, the estimators M 1 , M 2 , and P(x) are obtained in the same manner. Specifically, we use the same bandwidth, the same kernel, and the metric associated with the FBF with q = 0. Of course, the m-sample of (9) is drawn for a fixed location curve x = X i 0 and a fixed point y = Y i 0 . The index i 0 is randomly chosen, independently from the sampling process.
Furthermore, the behavior of the asymptotic distribution of the quantity (9) is examined by estimating the density of the obtained m-sample and comparing it to the density of the standard normal distribution. In order to evaluate the effect of the dependency degree and the missing rate on the accuracy of the asymptotic normality, we perform our sampling process using various values of op.norms and γ; exactly, we keep the same values as in the first illustration. Finally, we plot in Figures 2-5 the estimated density, obtained via the routine density with m = 120, against the density of N(0, 1). The continuous line represents the estimated density, and the dashed line represents the true density. Once again, this empirical analysis confirms the theoretical development: the estimation approach is strongly impacted by the degree of dependency, as well as by the missing rate. Typically, even if the curve of the estimated density is relatively close to the normal density, the accuracy of the asymptotic normality varies significantly with respect to the values of op.norms and γ. It appears that the effect of the missing rate is more important than that of the degree of dependency. This conclusion is justified by the fact that the missing rate impacts both the bias and the variance terms, whereas the dependency feature impacts only the variance part. To confirm this statement, we report in Table 3 the bias for different values of (op.norms) and γ; the bias term is computed by averaging, over the m replications, the deviation between the true CDF(y | x) and the LLECCDF obtained with the jth sample. The results in Table 3 show that the effect of the data correlation is small compared with the variability with respect to the missing rate. The third illustration concerns the estimation method itself, that is, the local linear approach. More precisely, we compare it with the classical kernel method, concentrating in this part on the second model, that is, the quantile regression.
It is well known that this kind of model has many scopes of application. One important application area is the prediction problem. At this stage, the quantile regression can be used as a single-point predictor when p = 0.5 or as a predictive interval [CQF p/2 , CQF 1−p/2 ]. The latter ensures the presence of the response variable Y in [CQF p/2 (x), CQF 1−p/2 (x)] with probability equal to 1 − p. In order to show the easy implementation of the estimator CQF p (·) and to highlight its advantages over the classical kernel method studied by [4], we compare the efficiency of both estimation methods in the construction of the interval [CQF p/2 (x), CQF 1−p/2 (x)]. Undoubtedly, the performance of any predictive interval is measured using two factors: the coverage probability and the length of the interval. However, for the sake of brevity, we focus in this third illustration only on the coverage probability, which measures the percentage of responses belonging to the approximated predictive interval. For this aim, we employ the same sampling process as in the second illustration, and for each sampling time j, we split the observations into a learning and a testing sample. We determine [CQF p/2 (x 0 ), CQF 1−p/2 (x 0 )] for every point x 0 in the testing sample, where CQF means either the local linear or the kernel estimator of the quantile CQF. For the computational aspect, we use the routine funopare.quantile.lcv for the kernel method, whereas the local linear method is obtained by minimizing (3) on a regular grid of 100 points from (0.9 * min(Y i ), 1.1 * max(Y i )). Finally, let us point out that we have used the same selection strategies as in the previous illustrations, namely, the same bandwidth, the same kernel, and the same FBF metric. In order to give a comprehensive comparison, we examine the efficiency of the predictive interval for various values of p.
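The coverage criterion used in this comparison can be sketched as follows; the helper simply counts how often the test responses fall inside their estimated predictive intervals, whichever estimator (local linear or kernel) produced the interval bounds:

```python
import numpy as np

def coverage_probability(y_test, q_low, q_high):
    """Empirical coverage of predictive intervals
    [CQF_{p/2}(x), CQF_{1-p/2}(x)]: the fraction of test responses
    falling inside their own estimated interval."""
    y, lo, hi = map(np.asarray, (y_test, q_low, q_high))
    return float(np.mean((lo <= y) & (y <= hi)))

# Three of the four responses below fall inside their interval.
cp = coverage_probability([1.0, 2.5, 3.0, 10.0],
                          [0.5, 2.0, 3.5, 9.0],
                          [1.5, 3.0, 4.0, 11.0])
print(cp)
```

A well-calibrated interval of nominal level 1 − p should give a value close to 1 − p on the testing sample.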
Once again, the estimation quality is strongly affected by the two principal features, the dependency and the missingness. However, the local linear algorithm appears more robust and better performing, in the sense that its behavior is more stable than that of the kernel method. To confirm this statement, we summarize in Table 4 the absolute coverage probability error, defined as the average of the deviations |CP i − (1 − α i )|, where CP i represents the coverage probability of the predictive interval associated with the threshold α i .
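The summary reported in Table 4 can be sketched as follows; the averaging of |CP i − (1 − α i )| over the threshold grid is our reading of the (extraction-damaged) definition and should be taken as an assumption:

```python
import numpy as np

def acpe(coverage, thresholds):
    """Absolute coverage probability error: mean distance between the
    empirical coverage CP_i and the nominal level 1 - alpha_i over the
    grid of thresholds alpha_i."""
    cp, alpha = np.asarray(coverage), np.asarray(thresholds)
    return float(np.mean(np.abs(cp - (1.0 - alpha))))

# Hypothetical empirical coverages for nominal levels 95%, 90%, 80%:
print(acpe([0.93, 0.91, 0.78], [0.05, 0.10, 0.20]))
```

Smaller values indicate predictive intervals whose empirical coverage tracks the nominal level more faithfully, which is the sense in which the local linear method is reported as more stable.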

Discussion and Comments
• Gap between the pointwise and uniform convergence in FDA: We specify the convergence rate over some known functional structures to highlight the gap between the pointwise and uniform convergence in functional statistics. The first one concerns the rescaled version (W c (t) = W(t/c)) t≥0 of a Gaussian process (W(t)) t≥0 . If we assume that the spectral measure µ of W satisfies a suitable moment condition, for some a > 0, then the Kolmogorov ε-entropy of C F , the unit ball of the reproducing kernel Hilbert space of the process W c (·) viewed as a subset of (C([0, 1]), · ∞ ), is of order (1/c) log 2 (1/ε); for instance, see [66]. This implies a corresponding uniform convergence rate over C F . Secondly, if we take C F as the closed unit ball of the Cameron-Martin space associated with the covariance operator of the standard stationary Ornstein-Uhlenbeck process, defined by Cov(s, t) = exp(−a|s − t|), a > 0, the Kolmogorov ε-entropy of this subset with respect to the Sobolev norm is also characterized in [66], together with the induced convergence rate. Thirdly, it is shown in [67] that any closed ball in the Sobolev space W 1,1 ([0, T]) endowed with the norm of L 1 ([0, T]) has a Kolmogorov ε-entropy of order (1/ε) log(1/ε), implying the associated uniform convergence rate. So, it is clear that the uniform convergence rate differs among functional subsets. However, in the nonfunctional case, where all norms are equivalent, the ε-entropy of any compact subset of R is of order log(1/ε), allowing us to keep the usual convergence rate of the finite-dimensional case, that is, (ln n/(nλ K )) 1/2 , which also coincides with the pointwise convergence rate. In conclusion, unlike the finite-dimensional case, there is a real gap between the pointwise and the uniform convergence in FDA. Thus, treating uniform consistency in the FDA is a challenging question; refer to [9,[68][69][70]].
• The effect of the basis function on Kolmogorov's entropy: Similarly to the previous paragraph, Kolmogorov's entropy is also affected by the spectral decomposition of the functional variable over specific basis functions. Of course, this relationship is justified by the fact that both tools have similar interpretations: the Kolmogorov entropy of a given subset C F is the number of binary bits of information required to describe any x ∈ C F with error ε, while the spectral decomposition of x ∈ C F over a basis of functions ( f i ) can be viewed as the reconstruction of x ∈ C F . Furthermore, the minimal number of elements of the basis ( f i ) sufficient to reconstruct any x ∈ C F with error ε is called the sampling ε-entropy of C F ; for instance, see [71]. Theorem 4.1 in that work shows, under some general conditions, that the class of bounded piecewise C k -smooth functions, as a subspace of L 2 , has a Kolmogorov ε-entropy of order (1/ε) 1/k . In contrast, the Kolmogorov ε-entropy of the class of periodic real analytic functions on [−π, π], as a subspace of C 0 ([−π, π]), is of order log 2 (1/ε).
As a consequence of this statement (see Corollary 4.2 in [71]), the sampling ε-entropy of the class spanned by the B-spline basis functions is greater than the sampling ε-entropy of the class of periodic real analytic functions spanned by the Fourier basis functions. In conclusion, in practice, the estimator's accuracy is also affected by the choice of basis functions and the number of basis elements used in the metric. • Estimation of the unconditional distribution of the MAR response: One of the fundamental applications of conditional modeling in the MAR structure is the possibility of reconstructing features of the MAR variable. In our context, we use the fact that the unconditional distribution F Y (y) is the expectation of CDF(y | X); thus, a natural estimator of the cumulative distribution function is obtained by averaging the LLECCDF over the observed curves. An alternative estimator of F Y (y) can be obtained by replacing only the missing observations with their conditional expectations. We refer to [48] for further ideas to construct other estimators. At this stage, the uniform consistency obtained in this paper is an important preliminary tool to derive the root-n consistency of this kind of estimator; we refer to [48] for the regression case.
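The two estimators of the unconditional distribution sketched above can be written as follows; this is a plausible reconstruction based on the surrounding description (identity, full averaging, and imputation of the missing responses only), not necessarily the authors' exact display:

```latex
% Identity exploited under MAR:
F_{Y}(y) \;=\; \mathbb{E}\!\left[\, \mathrm{CDF}(y \mid X) \,\right].
% First estimator: average the LLECCDF over all curves,
\widehat{F}_{Y}(y) \;=\; \frac{1}{n} \sum_{i=1}^{n}
\widehat{\mathrm{CDF}}(y \mid X_{i}).
% Second estimator: keep the observed responses and impute only the
% missing ones through the LLECCDF,
\widetilde{F}_{Y}(y) \;=\; \frac{1}{n} \sum_{i=1}^{n}
\left[\, B_{i}\, \mathbb{1}_{[Y_{i} \le y]}
      + (1 - B_{i})\, \widehat{\mathrm{CDF}}(y \mid X_{i}) \,\right].
```

The uniform consistency of the LLECCDF established in this paper is the key ingredient controlling the plug-in error terms of both estimators.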

Concluding Remarks
Establishing convergence rates in distribution-free functional data analysis is challenging: it requires advanced tools from functional analysis. This paper makes several contributions to the existing literature on functional data analysis. First, we demonstrated that the Kolmogorov entropy is an essential tool for characterizing the convergence rate of local linear estimation (LLE); we used this device to derive the uniform convergence rate of the LLE of the conditional cumulative distribution function (LLECCDF) and of the LLE of the conditional quantile function (LLECQF). Second, a central limit theorem was established for the proposed estimators. These results were proved under general assumptions that cover the case of incomplete functional time series; specifically, we modeled the dependence through the ergodicity assumption and assumed that the response variable is missing at random (MAR). Finally, we evaluated the finite-sample performance of the proposed estimators through Monte Carlo simulations. Beyond these issues, the present paper opens some important paths for the future. It would be natural to investigate, in future work, a functional kNN local linear quantile regression estimator, which would combine the advantages of both the local linear method and the kNN approach. Extending nonparametric functional concepts to locally stationary processes is also a relatively underdeveloped field; it would be intriguing to extend our work to functional locally stationary processes, although the nontrivial mathematics required would be far beyond the scope of this paper.
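The LLECQF discussed above is obtained by inverting the estimated conditional CDF through its generalized inverse, q_α(x) = inf{y : F(y|x) ≥ α}. The following is a minimal sketch of that inversion step only, with a scalar covariate and Nadaraya-Watson weights standing in for the paper's functional local linear weights; the data-generating model and bandwidth are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 1500
X = rng.uniform(-1, 1, n)
Y = 2.0 * X + 0.5 * rng.standard_normal(n)   # conditional median of Y given x is 2x

def cond_quantile(alpha, x0, h=0.2):
    """Conditional alpha-quantile at x0 via the generalized inverse of the
    kernel-estimated conditional CDF (NW weights stand in for local linear)."""
    w = np.exp(-0.5 * ((X - x0) / h) ** 2)   # Gaussian kernel weights
    order = np.argsort(Y)
    y_sorted = Y[order]
    cdf_vals = np.cumsum(w[order]) / np.sum(w)   # F_hat(y|x0) at each sorted Y
    idx = np.searchsorted(cdf_vals, alpha)       # smallest y with F_hat >= alpha
    return float(y_sorted[min(idx, n - 1)])

# Conditional median at x0 = 0.3; the true value under this model is 0.6
print(cond_quantile(0.5, 0.3))
```

Because the weights are fixed once x0 is given, the estimated conditional CDF is monotone in y, so the grid search over the sorted responses is a valid generalized inverse.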

Proofs
This section contains the proofs of our results. The notation introduced previously will be employed throughout. All of the proofs rely on an exponential inequality for martingale differences. The proofs are quite lengthy; we limit ourselves to the main arguments.
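For the reader's convenience, we recall one standard form of the exponential inequality for bounded martingale differences invoked throughout this section; the exact statement and constants used in the paper's lemmas may differ.

```latex
% A Bernstein/Freedman-type exponential inequality for martingale differences
% (one common form; constants in the paper's version may differ).
\begin{lemma}
Let $(Z_j)_{j\ge 1}$ be a sequence of martingale differences with respect to the
filtration $(\mathcal{F}_j)_{j\ge 1}$, with $|Z_j|\le M$ almost surely, and let
$V_n$ be a deterministic bound such that
$\sum_{j=1}^{n}\mathbb{E}\bigl(Z_j^{2}\mid\mathcal{F}_{j-1}\bigr)\le V_n$
almost surely. Then, for every $\varepsilon>0$,
\[
  \mathbb{P}\!\left(\Bigl|\sum_{j=1}^{n} Z_j\Bigr|>\varepsilon\right)
  \le 2\exp\!\left(-\frac{\varepsilon^{2}}{2\bigl(V_n+M\varepsilon/3\bigr)}\right).
\]
\end{lemma}
```

Bounds of this type, combined with a covering of C F by d n balls (whose cardinality is controlled by the Kolmogorov entropy), yield the uniform almost-sure rates stated in the lemmas below.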

Proof of the Main Theorems
Proof of Theorem 1. For (7), the estimation error is decomposed into a dispersion term and a bias term, so it remains only to establish Lemmas 1-3.
Lemma 1. Under Conditions (H1)-(H5), we have, as n → ∞, a uniform rate of order
O( ( ln d n / ( n ψ(λ K ) ) )^{1/2} ).
Proof of Theorem 3. Starting from the decomposition of the estimation error into a leading martingale term and negligible remainders, the asymptotic normality can be demonstrated via the following lemmas.

Lemma 6.
Under the assumptions of Theorem 3, we have, as n → ∞,

Proof of the Technical Lemmas
Proof of Lemma 1. Writing the term explicitly and integrating by parts yields the claimed bound. This completes the proof.
Proof of Lemma 2. Firstly, observe that, for all x ∈ C F , the quantity of interest decomposes into the terms Q 1 , Q 2 , and Q 3 . Next, we let k(x) = arg min k∈{1,2,...,d n } |δ|(x, x k ). We start by treating Q 2 . Using the boundedness of Ker(·) and J (·), and the fact that Ker(·) satisfies the Lipschitz condition, we obtain via (15) the required bound on Q 2 . Concerning F 2 , we put, for l = 0, 1 and k = 1, 2, the truncated terms T k,l and obtain T k,l ≤ T̄ k,l . Making use of condition (H4), for l = 1, then for k = 2, l = 0, and for k = 1, l = 1, we bound each term; this implies the bound for k = 1, 2. Thus, for (l, k) = (0, 2) and for l = k = 1, combining (16) and (17), we find the stated bound, and consequently we infer the claim for F 2 . Now, we apply an exponential inequality to the martingale differences Z j defined by

Keep in mind that
Therefore, we infer that IE(S j 2 |F j−1 ) ≤ 2C n 2 λ K 4 ψ j (λ K ). Now, for all ε > 0, choosing ε such that C 0 ε 2 = ς, and since Σ ∞ n=1 d n 1−ς < ∞, the Borel-Cantelli argument gives the almost-sure bound on Q 1 . For Q 3 , we follow the proof for Q 1 to obtain the analogous bound, which completes the proof of Lemma 2.

Proof of Lemma 3.
Obviously, the first statement is deduced by taking J j = 1 in Lemma 2, while for the second result, we use the fact that there exists x ∈ C F attaining the supremum bound. Hence, the proof is complete.
Proof of Lemma 4. Notice that it remains only to prove (19) and (20). The proof of (19) follows the same lines as Lemma 2, while (20) is based on the same idea as in Lemma 2. This completes the proof of Lemma 4.
Proof of Lemma 5. We prove the statements (21) and (22). In order to show (21), we keep the definition of k(x) as in Lemma 2, use the compactness of the ball B(0, M) in IR 2 , and take j(δ) = arg min j |δ − δ j |. Similarly to Lemma 2, we write V n ( δ j(δ) , x k(x) ) − A n (x k(x) ) − (V̄ n ( δ j(δ) , x k(x) ) − Ā n (x k(x) )) and decompose it into the terms T 1 , . . . , T 5 . We treat T 1 and T 2 as Q 2 in Lemma 2, while T 4 and T 5 are evaluated as Q 3 ; finally, we use the idea of Q 1 to evaluate T 3 . The statement (22) is a consequence of the expansion ∫ (u a Ker b (u))′ Ψ(u) du + o(ψ(λ K )).
Hence, the proof is complete.