Early Fault Detection in a Real Scenario of Hybrid Fiber–Coaxial Networks Using Machine Learning: An Approach Based on Decision Trees and Random Forests
Abstract
1. Introduction
2. Bibliographical Analysis
3. Problem Description
4. Conceptual Framework
4.1. HFC Network
4.2. DOCSIS
4.3. Machine Learning
4.4. Decision Trees
4.5. Random Forest
5. Methodology for Early Fault Detection in HFC Networks Using ML
5.1. Business Understanding
5.2. Data Understanding
5.3. Data Preparation
5.4. Modeling
5.5. Evaluation
6. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Szcerba, C.; Dávalos, E.; Leiva, A.; Pinto-Ríos, J. Early Fault Detection in a Real Scenario of Hybrid Fiber–Coaxial Networks Using Machine Learning: An Approach Based on Decision Trees and Random Forests. Appl. Sci. 2025, 15, 10442. https://doi.org/10.3390/app151910442