Seamless User-Generated Content Processing for Smart Media: Delivering QoE-Aware Live Media with YOLO-Based Bib Number Recognition
Abstract
1. Introduction
- Transformation of spectator roles: Demonstrating how spectators can be transformed from passive consumers into active co-creators of event narratives by seamlessly integrating UGC into a professional live production workflow.
- Feasibility of remote production: Examining the feasibility of remote production and the combined use of 5G and WiFi connectivity to ensure resilience, scalability, and high-quality service delivery in demanding live-event environments.
- Validation of automated Quality of Experience (QoE) management: Demonstrating the robust performance of a fully automated workflow for live media production that integrates real-time QoE monitoring and analysis.
- Assessment of AI-enhanced distributed media: Assessing the technical robustness and functional performance of AI-based media enrichment (runner/bib number detection) within a live, distributed workflow.
- Living Labs methodology validation: Validating the Living Labs methodology within the NEMO framework to assess the social and business potential of a hybrid media ecosystem enabled by participatory and AI-driven orchestration.
- Hybrid real-time media pipeline: Implementation of an end-to-end distributed media pipeline that supports a hybrid content model with real-time semantic enrichment.
- Integrated cognitive services: The seamless integration of advanced cognitive services into the live pipeline.
- Novel cloud-native architecture: The deployment of a cloud-native architecture based on Kubernetes to enable dynamic and scalable integration of heterogeneous sources (smartphone streams and remote production tools) into a unified live media workflow.
- Meta-OS orchestration: Leveraging the NEMO framework to provide dynamic resource management and workload migration; autonomous scalability for AI/ML nodes based on predefined Service Level Objectives (SLOs); and advanced network management features to ensure that bandwidth requirements are met and latency remains low.
2. Background and Related Work
2.1. Smart Media and Immersive Event Coverage in the Literature
2.2. Edge Computing, AI/ML for Media Annotation, and UGC Orchestration
3. Smart Media City Use Case
3.1. Use Case Description
- Resource management and migration of workloads across the continuum: The NEMO Meta-Orchestrator (MO) dynamically manages and migrates workloads, optimizing the placement of demanding tasks (such as runner bib detection) based on real-time resource availability and network conditions.
- Scalability in deploying additional AI/ML nodes on the cloud: This plane monitors the defined SLOs and uses AI-driven logic in the NEMO Cybersecure Federated Deep Reinforcement Learning (CF-DRL) component to trigger autonomous scaling decisions through the MO.
- Advanced network management to ensure bandwidth requirements: In this plane, the NEMO Meta Network Cluster Controller (mNCC) integrates network monitoring features and ensures that bandwidth requirements are met and latency remains low.
- A secure execution environment: This not only manages access but also enforces attested integrity and secure communication for every distributed component.
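The SLO-driven scaling behavior described in the bullets above can be sketched in a few lines. This is a minimal illustrative sketch, not the NEMO implementation: the names `SLO`, `decide_scaling`, the metric names, and the thresholds are all assumptions introduced here for clarity.

```python
# Illustrative sketch of SLO-driven scaling: when an observed metric breaches
# its SLO threshold, a scale-out request is issued (conceptually, to the MO).
# All names, metrics, and thresholds here are hypothetical, not NEMO APIs.
from dataclasses import dataclass


@dataclass
class SLO:
    metric: str       # e.g., "inference_latency_ms"
    threshold: float  # maximum tolerated value


def decide_scaling(observed: dict, slos: list[SLO]) -> str:
    """Return a scaling action based on which SLOs are currently breached."""
    breached = [s.metric for s in slos if observed.get(s.metric, 0.0) > s.threshold]
    if breached:
        # In the architecture above, this would translate into the MO
        # deploying additional AI/ML nodes on the cloud.
        return "scale_out:" + ",".join(breached)
    return "steady"


slos = [SLO("inference_latency_ms", 200.0), SLO("dropped_frames_pct", 1.0)]
print(decide_scaling({"inference_latency_ms": 350.0, "dropped_frames_pct": 0.2}, slos))
# → scale_out:inference_latency_ms
```

The real CF-DRL logic is learned rather than threshold-based; the sketch only conveys the monitor-then-act loop between the SLO plane and the orchestrator.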
3.2. System Components and Functional Roles
3.2.1. Race Stream and Spectator App
3.2.2. Media Gateway and Delivery Manager
3.2.3. Media Production Engine and Production Control
3.2.4. AI Engine
3.2.5. Video Quality Probe
- Stream metadata, including path, timestamp, image resolution (width and height), frame rate, and total number of frames.
- Video quality indicators, such as spatial activity, temporal activity, blur, blockiness, block loss, exposure, contrast, interlace artifacts, noise, and flickering, offering a comprehensive assessment of both spatial and temporal impairments. Additional flags (e.g., letterbox, pillarbox, freezing, blackout) indicate the presence or absence of specific visual artifacts.
- Audio quality indicators, covering average and peak audio volume.
- Binary alert flags for both video and audio, highlighting whether predefined thresholds were exceeded for specific metrics, including blur, blockiness, block loss, freezing, uniform frame, black frame, no audio, and silence.
- Predicted quality value (MOS value), providing an estimate of perceived quality for the analyzed video interval.
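The binary alert flags listed above are derived by comparing the raw indicators against predefined thresholds. The following is a minimal sketch of that thresholding step; the metric names, threshold values, and comparison direction are placeholders, not the probe's actual configuration.

```python
# Sketch of how the probe's binary alert flags could be derived from raw
# video quality indicators. Thresholds are illustrative placeholders; in
# practice the comparison direction is metric-specific (e.g., higher blur
# is worse, while some indicators improve as they approach 1.0).
DEFAULT_THRESHOLDS = {
    "blur": 4.0,        # hypothetical: flag if blur exceeds this value
    "block_loss": 5.0,  # hypothetical: flag if block loss exceeds this value
    "noise": 2.0,       # hypothetical: flag if noise exceeds this value
}


def raise_alerts(indicators: dict, thresholds: dict = DEFAULT_THRESHOLDS) -> dict:
    """Return one binary flag per monitored metric (True = threshold exceeded)."""
    return {m: indicators.get(m, 0.0) > t for m, t in thresholds.items()}


flags = raise_alerts({"blur": 5.0, "block_loss": 1.2, "noise": 0.8})
# flags == {"blur": True, "block_loss": False, "noise": False}
```

Together with the stream metadata and the predicted MOS, these flags form the per-interval report that the probe emits downstream.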
4. Implementation and Deployment
4.1. Technical Setup
4.2. Experimental Setup
5. Results and Validation
5.1. Captured Streams and Video Quality Assessment
5.2. Bib Number Detection and Recognition Results
6. Discussion
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Abbreviations
Abbreviation | Meaning |
---|---|
AI | Artificial Intelligence |
APN | Access Point Name |
CI/CD | Continuous Integration/Continuous Delivery |
CPU | Central Processing Unit |
CF-DRL | Cybersecure Federated Deep Reinforcement Learning |
GPS | Global Positioning System |
KPI | Key Performance Indicator |
MO | Meta-Orchestrator |
Meta-OS | Meta Operating System |
ML | Machine Learning |
mNCC | Meta Network Cluster Controller |
MOS | Mean Opinion Score |
NEMO | Next Generation Meta Operating System |
PCA | Principal Component Analysis |
QoE | Quality of Experience |
QoS | Quality of Service |
RTMP | Real-Time Messaging Protocol |
SLOs | Service Level Objectives |
UDP | User Datagram Protocol |
UGC | User-Generated Content |
VPN | Virtual Private Network |
YOLO | You Only Look Once |
References
- Ericsson. The Latest Social Media Trend: Live Streaming. 2016. Available online: https://www.ericsson.com/en/reports-and-papers/mobility-report/articles/latest-social-media-trend-live-streaming (accessed on 16 September 2025).
- Nokia. How 5G Will Transform Live Events. 2023. Available online: https://www.nokia.com/thought-leadership/articles/how-5g-will-transform-live-events/ (accessed on 16 September 2025).
- Ericsson. 5G: Meeting Consumer Demands at Big Events. Available online: https://www.ericsson.com/en/reports-and-papers/consumerlab/reports/5g-meeting-consumer-demands-at-big-events (accessed on 16 September 2025).
- Zhang, Y.; Wang, J.; Zhu, Y.; Xie, R. Subjective and objective quality evaluation of UGC video after encoding and decoding. Displays 2024, 83, 102719. [Google Scholar] [CrossRef]
- Orive, A.; Agirre, A.; Truong, H.L.; Sarachaga, I.; Marcos, M. Quality of Service Aware Orchestration for Cloud–Edge Continuum Applications. Sensors 2022, 22, 1755. [Google Scholar] [CrossRef] [PubMed]
- Li, Y.; Deng, G.; Bai, C.; Yang, J.; Wang, G.; Zhang, H.; Bai, J.; Yuan, H.; Xu, M.; Wang, S. Demystifying the QoS and QoE of Edge-hosted Video Streaming Applications in the Wild with SNESet. Proc. ACM Manag. Data 2023, 1, 236. [Google Scholar] [CrossRef]
- Ericsson. 5G Elevates Connectivity at 2024’s Biggest Events. 2024. Available online: https://www.ericsson.com/en/press-releases/3/2024/consumerlab-5g-elevates-connectivity-experiences (accessed on 16 September 2025).
- European Commission. The Next Generation Internet of Things|Shaping Europe’s Digital Future. Available online: https://digital-strategy.ec.europa.eu/en/policies/next-generation-internet-things (accessed on 16 September 2025).
- Chochliouros, I.P.; Pages-Montanera, E.; Alcázar-Fernández, A.; Zahariadis, T.; Velivassaki, T.H.; Skianis, C.; Rossini, R.; Belesioti, M.; Drosos, N.; Bakiris, E.; et al. NEMO: Building the Next Generation Meta Operating System. In Proceedings of the 3rd Eclipse Security, AI, Architecture and Modelling Conference on Cloud to Edge Continuum, eSAAM ’23, Ludwigsburg, Germany, 17 October 2023; pp. 1–9. [Google Scholar] [CrossRef]
- Belesioti, M.; Chochliouros, I.P.; Dimas, P.; Sofianopoulos, M.; Zahariadis, T.; Skianis, C.; Montanera, E.P. Putting Intelligence into Things: An Overview of Current Architectures. In Proceedings of the AIAI 2023 IFIP WG 12.5 International Workshops on Artificial Intelligence Applications and Innovations, León, Spain, 14–17 June 2023; Springer: Cham, Switzerland, 2023; pp. 106–117. [Google Scholar]
- Segou, O.; Skias, D.S.; Velivassaki, T.H.; Zahariadis, T.; Pages, E.; Ramiro, R.; Rossini, R.; Karkazis, P.A.; Muniz, A.; Contreras, L.; et al. NExt generation Meta Operating systems (NEMO) and Data Space: Envisioning the future. In Proceedings of the 4th Eclipse Security, AI, Architecture and Modelling Conference on Data Space, eSAAM ’24, Mainz, Germany, 22–24 October 2024; pp. 41–49. [Google Scholar] [CrossRef]
- Chen, J.; Jung, S.; Cai, L. A critical review of technology-facilitated event engagement: Current landscape and pathway forward. Int. J. Contemp. Hosp. Manag. 2025, 37, 169–189. [Google Scholar] [CrossRef]
- Chang, S.; Suh, J. The Impact of Digital Storytelling on Presence, Immersion, Enjoyment, and Continued Usage Intention in VR-Based Museum Exhibitions. Sensors 2025, 25, 2914. [Google Scholar] [CrossRef] [PubMed]
- Hu, M.; Luo, Z.; Pasdar, A.; Lee, Y.C.; Zhou, Y.; Wu, D. Edge-Based Video Analytics: A Survey. arXiv 2023, arXiv:2303.14329. [Google Scholar] [CrossRef]
- Jiang, X.; Yu, F.R.; Song, T.; Leung, V.C.M. A Survey on Multi-Access Edge Computing Applied to Video Streaming: Some Research Issues and Challenges. IEEE Commun. Surv. Tutor. 2021, 23, 871–903. [Google Scholar] [CrossRef]
- Bisicchia, G.; Forti, S.; Pimentel, E.; Brogi, A. Continuous QoS-compliant orchestration in the Cloud-Edge continuum. Software Pract. Exp. 2024, 54, 2191–2213. [Google Scholar] [CrossRef]
- Sodagi, S.; Mangalwede, S.; Hariharan, S. Race AI: A Deep Learning Approach to Marathon Bib Detection and Recognition. In Proceedings of the 2025 4th International Conference on Advances in Computing, Communication, Embedded and Secure Systems (ACCESS), Ernakulam, India, 11–13 June 2025; pp. 752–757. [Google Scholar] [CrossRef]
- Keltsch, M.; Prokesch, S.; Gordo, O.P.; Serrano, J.; Phan, T.K.; Fritzsch, I. Remote Production and Mobile Contribution Over 5G Networks: Scenarios, Requirements and Approaches for Broadcast Quality Media Streaming. In Proceedings of the 2018 IEEE International Symposium on Broadband Multimedia Systems and Broadcasting (BMSB), Valencia, Spain, 6–8 June 2018; pp. 1–7. [Google Scholar] [CrossRef]
- Anitha, R.; Esther Jyothi, V.; Regulagadda, R.; Macherla, S.; Naga Malleswari, D.; Siva Nageswara Rao, G.; Nagesh, P. Cloud Computing and Multimedia IoT. In Multimedia Technologies in the Internet of Things Environment; Springer Nature: Singapore, 2025; Volume 4, pp. 227–242. [Google Scholar] [CrossRef]
- Liu, S.; Wang, S.; Ye, F.; Wu, Q. Cloud-Edge Collaborative Transcoding for Adaptive Video Streaming: Enhancing QoE in Wireless Networks. IEEE Trans. Green Commun. Netw. 2025. [Google Scholar] [CrossRef]
- Terven, J.; Córdova-Esparza, D.M.; Romero-González, J.A. A Comprehensive Review of YOLO Architectures in Computer Vision: From YOLOv1 to YOLOv8 and YOLO-NAS. Mach. Learn. Knowl. Extr. 2023, 5, 1680–1716. [Google Scholar] [CrossRef]
- Nasrabadi, M.A.; Beauregard, Y.; Ekhlassi, A. The implication of user-generated content in new product development process: A systematic literature review and future research agenda. Technol. Forecast. Soc. Change 2024, 206, 123551. [Google Scholar] [CrossRef]
- Ettrich, O.; Stahlmann, S.; Leopold, H.; Barrot, C. Automatically identifying customer needs in user-generated content using token classification. Decis. Support Syst. 2024, 178, 114107. [Google Scholar] [CrossRef]
- Zhao, H.; Tang, Z.; Li, Z.; Dong, Y.; Si, Y.; Lu, M.; Panoutsos, G. Real-Time Object Detection and Robotic Manipulation for Agriculture Using a YOLO-Based Learning Approach. In Proceedings of the 2024 IEEE International Conference on Industrial Technology (ICIT), Bristol, UK, 25–27 March 2024; pp. 1–6. [Google Scholar] [CrossRef]
- Yue, S.; Zhang, Z.; Shi, Y.; Cai, Y. WGS-YOLO: A real-time object detector based on YOLO framework for autonomous driving. Comput. Vis. Image Underst. 2024, 249, 104200. [Google Scholar] [CrossRef]
- Midoglu, C.; Sabet, S.S.; Sarkhoosh, M.H.; Majidi, M.; Gautam, S.; Solberg, H.M.; Kupka, T.; Halvorsen, P. AI-Based Sports Highlight Generation for Social Media. In Proceedings of the 3rd Mile-High Video Conference, MHV ’24, Denver, CO, USA, 11–14 February 2024; pp. 7–13. [Google Scholar] [CrossRef]
- Vamsikeshwaran, M. AI Powered Video Content Moderation Governed by Intensity Based Custom Rules with Remedial Pipelines. In Proceedings of the 2024 International Conference on Computer Vision and Image Processing, Chennai, India, 19–21 December 2024; Springer: Cham, Switzerland, 2024; pp. 390–403. [Google Scholar]
- Bai, T.; Zhao, H.; Huang, L.; Wang, Z.; Kim, D.I.; Nallanathan, A. A Decade of Video Analytics at Edge: Training, Deployment, Orchestration, and Platforms. IEEE Commun. Surv. Tutor. 2025. [Google Scholar] [CrossRef]
- Ortiz-Arce, S.; Llorente, A.; Rio, A.D.; Alvarez, F. An Enhanced Method for Objective QoE Analysis in Adaptive Streaming Services. IEEE Access 2025, 13, 159273–159285. [Google Scholar] [CrossRef]
- Martinez, R.; Llorente, A.; del Rio, A.; Serrano, J.; Jimenez, D. Performance Evaluation of YOLOv8-Based Bib Number Detection in Media Streaming Race. IEEE Trans. Broadcast. 2024, 70, 1126–1138. [Google Scholar] [CrossRef]
- Jocher, G.; Chaurasia, A.; Qiu, J. Ultralytics YOLOv8. 2023. Available online: https://docs.ultralytics.com/es/models/yolov8/#yolov8-usage-examples (accessed on 30 April 2025).
- Ben-Ami, I.; Basha, T.; Avidan, S. Racing Bib Numbers Recognition. In Proceedings of the BMVC 2012, Guildford, UK, 3–7 September 2012; pp. 1–10. [Google Scholar]
- Hernandez-Carrascosa, P.; Penate-Sanchez, A.; Lorenzo-Navarro, J.; Freire-Obregon, D.; Castrillon-Santana, M. TGCRBNW: A Dataset for Runner Bib Number Detection (and Recognition) in the Wild. In Proceedings of the 2020 25th International Conference on Pattern Recognition (ICPR), Milan, Italy, 10–15 January 2021; pp. 9445–9451. [Google Scholar] [CrossRef]
- HCMUS. Bib Detection Big Data Dataset. 2023. Available online: https://universe.roboflow.com/hcmus-3p8wh/bib-detection-big-data (accessed on 12 February 2024).
- Netzer, Y.; Wang, T.; Coates, A.; Bissacco, A.; Wu, B.; Ng, A.Y. Reading Digits in Natural Images with Unsupervised Feature Learning. In Proceedings of the NIPS 2011 Workshop on Deep Learning and Unsupervised Feature Learning, Granada, Spain, 16–17 December 2011. [Google Scholar]
- Recommendation ITU-R BT.500-11: Methodology for the Subjective Assessment of the Quality of Television Pictures; ITU: Geneva, Switzerland, 2002.
- Leszczuk, M.; Hanusiak, M.; Farias, M.C.Q.; Wyckens, E.; Heston, G. Recent developments in visual quality monitoring by key performance indicators. Multimed. Tools Appl. 2016, 75, 10745–10767. [Google Scholar] [CrossRef]
- del Rio, A.; Serrano, J.; Jimenez, D.; Contreras, L.M.; Alvarez, F. Multisite gaming streaming optimization over virtualized 5G environment using Deep Reinforcement Learning techniques. Comput. Netw. 2024, 244, 110334. [Google Scholar] [CrossRef]
- Red Hat. What is GitOps? Available online: https://www.redhat.com/en/topics/devops/what-is-gitops (accessed on 16 September 2025).
- European Broadcasting Union (EBU). EBU—Recommendation R132: Signal Quality in HDTV Production and Broadcast Services; Technical report, Guidelines for technical, operational & creative staff on how to achieve and maintain sufficient technical quality along the production chain; European Broadcasting Union: Geneva, Switzerland, 2011. [Google Scholar]
Metric | Mean | Median | Min | Max | Std. Dev. |
---|---|---|---|---|---|
MOS | 3.27 | 3.29 | 2.55 | 3.85 | 0.34 |
Blockiness | 0.95 | 0.95 | 0.92 | 0.98 | 0.02 |
Blur | 3.63 | 3.63 | 3.10 | 4.22 | 0.28 |
Block loss | 3.25 | 2.65 | 0.73 | 8.78 | 1.92 |
Temporal activity | 23.93 | 22.79 | 7.15 | 42.37 | 8.07 |
Spatial activity | 96.30 | 95.17 | 78.18 | 121.67 | 8.69 |
Exposure | 128.89 | 129.04 | 123.35 | 131.72 | 1.89 |
Contrast | 44.81 | 46.46 | 36.11 | 51.70 | 4.78 |
Noise | 0.91 | 0.73 | 0.39 | 3.32 | 0.52 |
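The summary statistics in the table above are straightforward aggregates of the probe's per-interval measurements. A minimal sketch of that aggregation follows; the sample MOS values used in the example are hypothetical, not the measured data.

```python
# Sketch of aggregating per-interval probe measurements into the summary
# statistics reported in the table (mean, median, min, max, std. dev.).
# The sample values below are illustrative, not the experiment's data.
import statistics


def summarize(samples: list[float]) -> dict:
    """Compute the five summary statistics used in the results table."""
    return {
        "mean": statistics.mean(samples),
        "median": statistics.median(samples),
        "min": min(samples),
        "max": max(samples),
        "std": statistics.stdev(samples),  # sample standard deviation
    }


mos_samples = [3.1, 3.4, 2.9, 3.6, 3.3]  # hypothetical per-interval MOS values
stats = summarize(mos_samples)
```

Each row of the table corresponds to one such call over the full set of analyzed intervals for that metric.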
Metric | Value |
---|---|
True Positives (TPs) | 221 |
False Positives (FPs) | 15 |
False Negatives (FNs) | 54 |
Precision | 0.9364 |
Recall | 0.8036 |
F1-Score | 0.8650 |
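The precision, recall, and F1-score in the table follow directly from the reported TP/FP/FN counts via the standard definitions. The short check below reproduces the table's values:

```python
# Reproduce the precision, recall, and F1-score in the table above from the
# reported detection counts (TP = 221, FP = 15, FN = 54).
def detection_scores(tp: int, fp: int, fn: int) -> tuple[float, float, float]:
    """Standard detection metrics: precision, recall, and their harmonic mean."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


p, r, f1 = detection_scores(tp=221, fp=15, fn=54)
print(f"{p:.4f} {r:.4f} {f1:.4f}")  # → 0.9364 0.8036 0.8650
```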
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Citation: del Rio, A.; Llorente, Á.; Ortiz-Arce, S.; Belesioti, M.; Pappas, G.; Muñiz, A.; Contreras, L.M.; Christopoulos, D. Seamless User-Generated Content Processing for Smart Media: Delivering QoE-Aware Live Media with YOLO-Based Bib Number Recognition. Electronics 2025, 14, 4115. https://doi.org/10.3390/electronics14204115