Data-Driven Modeling of Web Traffic Flow Using Functional Modal Regression
Abstract
1. Introduction
2. FTSDA Framework and the Local Linear Modal Regression
Algorithm
- Compute
- Fit local linear quantile regressions over a coarse grid in p;
- Estimate by
- Select where is smallest;
- Output
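The steps above can be illustrated with a minimal sketch. It rests on a standard fact: the derivative of a quantile function in p equals 1 / f(Q(p)), so the density is largest, and the mode is attained, where the quantile curve is flattest. Everything concrete below is an assumption made for illustration and not the authors' implementation: the quadratic kernel on [0, 1], the scalar locating values `loc`, the distances `dist`, the finite-difference derivative over the grid, and the exhaustive pairwise solver for the local linear quantile fit, which is exact but practical only for small samples.

```python
import math
import random


def check_loss(u, p):
    """Quantile (pinball) loss rho_p(u) = u * (p - 1{u < 0})."""
    return u * (p - (1.0 if u < 0 else 0.0))


def kernel(t):
    """Quadratic kernel on [0, 1] (an assumed choice, not the paper's)."""
    return 1.5 * (1.0 - t * t) if 0.0 <= t < 1.0 else 0.0


def local_linear_quantile(loc, dist, y, p, h):
    """Intercept a of argmin_{a,b} sum_i K(dist_i/h) * rho_p(y_i - a - b*loc_i).

    A two-parameter quantile fit has an optimum that interpolates two
    observations, so searching all pairs is exact (O(n^3), small n only).
    """
    pts = [(l, v, kernel(d / h)) for l, d, v in zip(loc, dist, y)]
    pts = [t for t in pts if t[2] > 0.0]  # keep points inside the bandwidth
    best_a, best_obj = None, math.inf
    for i in range(len(pts)):
        for j in range(i + 1, len(pts)):
            li, yi, _ = pts[i]
            lj, yj, _ = pts[j]
            if li == lj:
                continue  # equal locating values do not define a line
            b = (yj - yi) / (lj - li)
            a = yi - b * li
            obj = sum(w * check_loss(v - a - b * l, p) for l, v, w in pts)
            if obj < best_obj:
                best_a, best_obj = a, obj
    if best_a is None:
        raise ValueError("need two local points with distinct locating values")
    return best_a


def conditional_mode(loc, dist, y, h, grid):
    """Return (mode estimate, selected p) from the flattest part of the quantile curve."""
    # Sorting removes possible quantile crossings (a pragmatic choice).
    q = sorted(local_linear_quantile(loc, dist, y, p, h) for p in grid)
    best_k, best_slope = 1, math.inf
    for k in range(1, len(grid) - 1):
        # Central finite difference approximating dQ/dp at grid[k].
        slope = (q[k + 1] - q[k - 1]) / (grid[k + 1] - grid[k - 1])
        if slope < best_slope:
            best_k, best_slope = k, slope
    return q[best_k], grid[best_k]


if __name__ == "__main__":
    random.seed(0)
    loc = [random.uniform(-1.0, 1.0) for _ in range(60)]
    dist = [abs(l) for l in loc]
    # Right-skewed noise: Gamma(3, 1) has mode 2 and mean 3, so the
    # conditional mode at loc = 0 is 4 while the conditional mean is 5.
    y = [2.0 + 2.0 * l + random.gammavariate(3.0, 1.0) for l in loc]
    grid = [k / 10 for k in range(1, 10)]
    print(conditional_mode(loc, dist, y, h=1.0, grid=grid))
```

With a coarse grid and a small sample the selected level p can move between neighboring grid points, so in practice the grid spacing acts as a second smoothing parameter alongside the bandwidth `h`.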
3. Mathematical Support
- (T1)
- For any , , and there exists a function such that
- (T2)
- The function is of class , and satisfies the following Lipschitz condition: where denotes a neighborhood of , and is the conditional cumulative distribution function of given .
- (T3)
- The sequence satisfies and
- (T4)
- The kernel is a positive and differentiable function, which is supported within , and such that is a positive definite matrix.
- (T5)
- The bandwidth satisfies the following: , such that
4. Web Traffic Flow Modeling
4.1. Test of the Data-Driven Approach over Artificial Data
4.2. Real Data Application
5. Discussion and Conclusions
- Contribution and Positioning: The novelty of this contribution is that it is the first paper to combine three important tools of statistical modeling: robust estimation, the local linear smoothing approach, and the functional time series structure. Compared with existing work in functional data analysis, the gain is substantial in both theoretical and practical terms. From the theoretical point of view, the proposed conditional mode estimator improves on the kernel estimator used by [5]. The kernel estimator of [5] is obtained by maximizing the conditional density function, whereas the proposed estimator is built from the conditional quantile. This construction improves the robustness and accuracy of the estimator. In addition, it is well documented that the local linear approach reduces the bias term of the kernel method.

  From the practical point of view, the functional time series structure studied here is more general than the linear process proposed by [4], because the strong mixing assumption is also fulfilled by many nonlinear processes. Working under the mixing structure therefore covers a large class of functional time series data, whether the underlying process is linear or nonlinear.

  Web traffic flow is usually influenced by many factors, such as time of day, special events, promotions, and user interest. Linearity is rarely guaranteed for this kind of data, so linear models may provide inaccurate forecasts, especially when the data show strong fluctuations or contain outliers. Consequently, more flexible nonlinear models are often more appropriate to represent the dynamic nature of web traffic flow.
- Connection Between the Estimator and Web Traffic Data: The main feature of the proposed algorithm is that it combines three fundamental components of mathematical statistics: functional data modeling, the local linear approach, and modal regression as a robust predictor. Together, these tools yield an effective and comprehensive algorithm for modeling web traffic flow. Functional data modeling treats the entire traffic curve as a single observation, which lets the model exploit the temporal dependence and smooth variations in web activity over time. The local linear approach improves estimation accuracy by approximating the model linearly in a neighborhood of the location point, and so adapts to the local behavior of both the model and the data. This local adaptation is especially useful during peak hours or special events, when traffic changes rapidly. The robust estimation component increases the model’s resistance to outliers caused by unusual numbers of visits, viral content, or server errors, ensuring more stable predictions. Conditional mode prediction estimates the most probable future value rather than the average, which is particularly important when the web traffic distribution is skewed or contains extreme values. Overall, local linear estimation of the robust mode is well suited to high-frequency web traffic data. It improves prediction accuracy compared with standard linear models, which often fail to capture the nonlinearity and high variability of web traffic data.
- Conclusions: In this work we have introduced a new predictor based on the estimation of the -modal regression using a local linear approach. The theoretical part provides mathematical support for implementing the estimator in practice. Specifically, we have established almost complete consistency under a strong mixing dependency assumption, which serves as an alternative to the conventional correlation-based criteria. Empirical results from both simulated and real datasets, including web traffic data, confirm that the feasibility of the proposed estimator is closely linked to the parameters involved in its construction. Although combining the framework with a local linear approach improves both robustness and predictive accuracy, especially for complex functional data such as web traffic curves, the accuracy of the estimator depends on the strength of the dependence in the data, the smoothness of the underlying nonparametric model, and the careful choice of the bandwidth and the semi-metric. Selecting these parameters can be challenging, as inappropriate choices may substantially affect the estimator’s robustness and predictive performance.

  In addition to these findings, this study highlights several directions for future research. The first is the asymptotic distribution of the normalized estimator under various forms of functional strong mixing, such as association or Markovian sequences. Another important extension concerns spatio-functional modeling, which takes into account the geographic coordinates of the data. While these extensions focus on dependencies in the data, further generalizations to other smoothing techniques, including kNN methods and semi-partial linear approaches, are also interesting topics for the future.
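As an illustration of the functional time series construction discussed in this section, the sketch below cuts a scalar traffic series into curves, pairs each curve with the value observed a given horizon later, and measures closeness between curves with an L2-type semi-metric. The function names `make_pairs` and `l2_semi_metric`, the non-overlapping cutting rule, the synthetic sinusoidal series, and the choice of the L2 distance are all assumptions made for the example. The source does not state how the curves or the semi-metric were actually built.

```python
import math


def make_pairs(series, period, horizon):
    """Build (curve, response) pairs from a scalar series.

    Curve i holds the `period` consecutive values of cycle i; the response
    is the value observed `horizon` steps after the end of that curve.
    """
    pairs = []
    n_curves = len(series) // period
    for i in range(n_curves):
        target = (i + 1) * period - 1 + horizon
        if target >= len(series):
            break  # the response for this curve is not observed yet
        pairs.append((series[i * period:(i + 1) * period], series[target]))
    return pairs


def l2_semi_metric(x1, x2, step=1.0):
    """Riemann-sum approximation of the L2 distance between two sampled curves."""
    return math.sqrt(step * sum((a - b) ** 2 for a, b in zip(x1, x2)))


if __name__ == "__main__":
    # Synthetic "hourly traffic": a daily cycle plus one isolated spike.
    T = 24
    series = [100 + 50 * math.sin(2 * math.pi * t / T) for t in range(5 * T)]
    series[30] += 400  # outlier, e.g. a burst of visits to one page
    pairs = make_pairs(series, period=T, horizon=T // 2)
    x_new = series[4 * T:5 * T]  # most recent complete curve
    for curve, response in pairs:
        print(round(l2_semi_metric(curve, x_new), 2), round(response, 2))
```

The distances printed by the demo are the quantities a kernel would weight: curves close to the most recent one receive large weights, while the curve containing the spike sits far away and contributes little to the local fit.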
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
Appendix A. Proofs of the Asymptotic Result
Appendix B. Notation Box and Acronyms List
| Term | Acronym or Symbol |
|---|---|
| Functional Data Analysis | FDA |
| Functional Time Series Data Analysis | FTSDA |
| Nadaraya–Watson Estimator | NW |
| Local Linear Estimation Method | LLEM |
| Local Constant Approach | LCA |
| Conditional Mode | CM |
| Conditional Cumulative Function | CC |
| Conditional Density Function | CD |
| The Kernel Function | |
| The Functional Bandwidth Sequence | |
| The Locating Function | M |
| The Conditional Mode Estimator | |
| The Conditional Quantile Estimator | |
| The Estimator of the Derivative of the Conditional Quantile | |
| The Bandwidth Sequence of the Derivative | |
References
1. Li, J.; Li, J.; Jia, N.; Li, X.; Ma, W.; Shi, S. GeoTraPredict: A machine learning system of web spatio-temporal traffic flow. Neurocomputing 2021, 428, 317–324.
2. Park, D.-C. Structure optimization of BiLinear Recurrent Neural Networks and its application to Ethernet network traffic prediction. Inf. Sci. 2013, 237, 18–28.
3. Shelatkar, T.; Tondale, S.; Yadav, S.; Ahir, S. Web Traffic Time Series Forecasting using ARIMA and LSTM RNN. In Proceedings of the 2020 International Conference on Data Science and Engineering, Mumbai, India, 12–14 August 2020.
4. Bosq, D. Linear Processes in Function Spaces; Lecture Notes in Statistics; Springer: New York, NY, USA, 2000; Volume 149.
5. Ferraty, F.; Vieu, P. Nonparametric Functional Data Analysis. Theory and Practice; Springer Series in Statistics; Springer: New York, NY, USA, 2006.
6. Masry, E. Nonparametric regression estimation for dependent functional data: Asymptotic normality. Stochastic Process. Appl. 2005, 115, 155–177.
7. Ling, N.; Vieu, P. Nonparametric modelling for functional data: Selected survey and tracks for future. Statistics 2018, 52, 934–949.
8. Bouzebda, S.; Laksaci, A.; Mohammedi, M. Single index regression model for functional quasi-associated time series data. REVSTAT 2022, 20, 605–631.
9. Wang, L. Nearest neighbors estimation for long memory functional data. Stat. Methods Appl. 2020, 29, 709–725.
10. Collomb, G.; Härdle, W.; Hassani, S. A note on prediction via estimation of the conditional mode function. J. Stat. Plan. Inference 1986, 15, 227–236.
11. Dabo-Niang, S.; Kaid, Z.; Laksaci, A. Spatial conditional quantile regression: Weak consistency of a kernel estimate. Rev. Roum. Math. Pures Appl. 2012, 57, 311–339.
12. Bouzebda, S.; Didi, S. Some results about kernel estimators for function derivatives based on stationary and ergodic continuous time processes with applications. Commun. Stat. Theory Methods 2022, 51, 3886–3933.
13. Dabo-Niang, S.; Kaid, Z.; Laksaci, A. Asymptotic properties of the kernel estimate of spatial conditional mode when the regressor is functional. AStA Adv. Stat. Anal. 2015, 99, 131–160.
14. Ezzahrioui, M.H.; Ould-Saïd, E. Asymptotic normality of a nonparametric estimator of the conditional mode function for functional data. J. Nonparametr. Stat. 2008, 20, 3–18.
15. Xu, Y. Functional Data Analysis. In Springer Handbook of Engineering Statistics; Pham, H., Ed.; Springer: London, UK, 2023; pp. 67–85.
16. Cardot, H.; Crambes, C.; Sarda, P. Quantile regression when the covariates are functions. J. Nonparametr. Stat. 2005, 17, 841–856.
17. Wang, H.; Ma, Y. Optimal subsampling for quantile regression in big data. Biometrika 2021, 108, 99–112.
18. Dabo-Niang, S.; Kaid, Z.; Laksaci, A. On spatial conditional mode estimation for a functional regressor. Stat. Probab. Lett. 2012, 82, 1413–1421.
19. Dabana, H.; Agbokou, K.; Gneyou, K. Local linear estimation of conditional probability density and mode under right censoring and left truncation: Dependent data case. Gulf J. Math. 2025, 20, 338–359.
20. Azzi, A.; Laksaci, A.; Ould-Saïd, E. On the robustification of the kernel estimator of the functional modal regression. Stat. Probab. Lett. 2021, 181, 109256.
21. Azzi, A.; Belguerna, A.; Laksaci, A.; Rachdi, M. The scalar-on-function modal regression for functional time series data. J. Nonparametr. Stat. 2024, 36, 503–526.
22. Alamari, M.B.; Almulhim, F.A.; Almanjahie, I.M.; Bouzebda, S.; Laksaci, A. Scalar-on-Function Mode Estimation Using Entropy and Ergodic Properties of Functional Time Series Data. Entropy 2025, 27, 552.
23. Fan, J. Local Polynomial Modelling and Its Applications: Monographs on Statistics and Applied Probability 66; Routledge: Abingdon-on-Thames, UK, 2018.
24. Rachdi, M.; Laksaci, A.; Demongeot, J.; Abdali, A.; Madani, F. Theoretical and practical aspects of the quadratic error in the local linear estimation of the conditional density for functional data. Comput. Stat. Data Anal. 2014, 73, 53–68.
25. Baíllo, A.; Grané, A. Local linear regression for functional predictor and scalar response. J. Multivar. Anal. 2009, 100, 102–111.
26. Barrientos-Marin, J.; Ferraty, F.; Vieu, P. Locally modelled regression and functional data. J. Nonparametr. Stat. 2010, 22, 617–632.
27. Berlinet, A.; Elamine, A.; Mas, A. Local linear regression for functional data. Ann. Inst. Stat. Math. 2011, 63, 1047–1075.
28. Demongeot, J.; Laksaci, A.; Madani, F.; Rachdi, M. Functional data: Local linear estimation of the conditional density and its application. Statistics 2013, 47, 26–44.
29. Laksaci, A.; Ould Saïd, E.; Rachdi, M. Uniform consistency in number of neighbors of the kNN estimator of the conditional quantile model. Metrika 2021, 84, 895–911.
30. Almulhim, F.A.; Alamari, N.B.; Laksaci, A.; Kaid, Z. Modal Regression Estimation by Local Linear Approach in High-Dimensional Data Case. Axioms 2025, 14, 537.
31. Jones, D.A. Nonlinear autoregressive processes. Proc. R. Soc. Lond. A 1978, 360, 71–95.
32. Ozaki, T. Nonlinear Time Series Models for Nonlinear Random Vibrations; Technical Report; University of Manchester: Manchester, UK, 1979.
33. Engle, R.F. Autoregressive conditional heteroskedasticity with estimates of the variance of U.K. inflation. Econometrica 1982, 50, 987–1007.
34. Al-Awadhi, F.A.; Kaid, Z.; Laksaci, A.; Ouassou, I.; Rachdi, M. Functional data analysis: Local linear estimation of the L1-conditional quantiles. Stat. Methods Appl. 2019, 28, 217–240.
35. Koumar, J.; Hynek, K.; Čejka, T.; Šiška, P. CESNET-TimeSeries24: Time Series Dataset for Network Traffic Anomaly Detection and Forecasting. Sci. Data 2025, 12, 338.




| Multiplicative Factor MF | r | Sample Size n | MSE |
|---|---|---|---|
| MF = 1 | | 50 | 0.94 |
| | | 50 | 0.72 |
| | | 50 | 0.64 |
| | | 100 | 0.71 |
| | | 100 | 0.56 |
| | | 100 | 0.47 |
| | | 250 | 0.12 |
| | | 250 | 0.32 |
| | | 250 | 0.58 |
| MF = 5 | | 50 | 1.08 |
| | | 50 | 0.92 |
| | | 50 | 0.80 |
| | | 100 | 0.92 |
| | | 100 | 0.81 |
| | | 100 | 0.63 |
| | | 250 | 0.44 |
| | | 250 | 0.51 |
| | | 250 | 0.53 |
| MF = 10 | | 50 | 1.13 |
| | | 50 | 1.04 |
| | | 50 | 0.93 |
| | | 100 | 0.92 |
| | | 100 | 0.88 |
| | | 100 | 0.75 |
| | | 250 | 0.63 |
| | | 250 | 0.70 |
| | | 250 | 0.74 |

| Future Time Horizon h | m | Selector Rule | Locating Function | MSE |
|---|---|---|---|---|
| h = 0.5 T | | Rule 1 | | 0.21 |
| | | | | 0.24 |
| | | Rule 2 | | 0.28 |
| | | | | 0.32 |
| | | Rule 3 | | 0.15 |
| | | | | 0.22 |
| | | Rule 1 | | 0.43 |
| | | | | 0.52 |
| | | Rule 2 | | 0.43 |
| | | | | 0.63 |
| | | Rule 3 | | 0.35 |
| | | | | 0.41 |
| | | Rule 1 | | 0.66 |
| | | | | 0.71 |
| | | Rule 2 | | 0.79 |
| | | | | 0.68 |
| | | Rule 3 | | 0.57 |
| | | | | 0.69 |
| h = 2 T | | Rule 1 | | 0.93 |
| | | | | 0.96 |
| | | Rule 2 | | 1.05 |
| | | | | 0.32 |
| | | Rule 3 | | 0.87 |
| | | | | 0.84 |
| | | Rule 1 | | 1.11 |
| | | | | 1.23 |
| | | Rule 2 | | 1.22 |
| | | | | 1.45 |
| | | Rule 3 | | 1.35 |
| | | | | 1.21 |
| | | Rule 1 | | 1.33 |
| | | | | 1.41 |
| | | Rule 2 | | 1.39 |
| | | | | 1.36 |
| | | Rule 3 | | 1.35 |
| | | | | 1.42 |
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Kaid, Z.; Alamari, M.B. Data-Driven Modeling of Web Traffic Flow Using Functional Modal Regression. Axioms 2025, 14, 815. https://doi.org/10.3390/axioms14110815

