Application of Optimized Adaptive Neuro-Fuzzy Inference for High Frame Rate Video Quality Assessment
Abstract
Featured Application
1. Introduction
1.1. Motivation
1.2. Key Research Contributions
- The integration of neural-network learning within the fuzzy inference system allows the model to tune the membership-function parameters and the rule-consequent parameters simultaneously, resulting in a more adaptive and robust system.
- By employing subtractive clustering with optimized parameters such as range of influence, squash factor, and accept/reject ratios, the model generates a lightweight, compact rule base that minimizes computational demands without sacrificing accuracy.
- The new model achieves a lower RMSE (Root Mean Squared Error) and a higher PCC (Pearson Correlation Coefficient) and SROCC (Spearman Rank-Order Correlation Coefficient) than both traditional and recently developed objective VQA models, improving the accuracy of MOS prediction; the three metrics are computed as in the sketch following this list.
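For reference, the three evaluation criteria can be computed as in the following minimal sketch. The array names are illustrative only; `scipy.stats` supplies both correlation coefficients.

```python
# Minimal sketch of the three evaluation metrics used throughout the paper.
import numpy as np
from scipy.stats import pearsonr, spearmanr

def evaluate_vqa(mos_true: np.ndarray, mos_pred: np.ndarray) -> dict:
    """RMSE, PCC, and SROCC between subjective and predicted MOS."""
    rmse = float(np.sqrt(np.mean((mos_true - mos_pred) ** 2)))
    pcc, _ = pearsonr(mos_true, mos_pred)     # linear agreement
    srocc, _ = spearmanr(mos_true, mos_pred)  # monotonic (rank) agreement
    return {"RMSE": rmse, "PCC": float(pcc), "SROCC": float(srocc)}

# Example with dummy scores:
# evaluate_vqa(np.array([60.0, 75.0, 82.0]), np.array([58.5, 77.1, 80.0]))
```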
1.3. Organization of the Manuscript
2. Dataset Used for the Model Development
2.1. Video-Sequence Preparation and Subjective Quality Assessment
2.2. Dataset Limitations
3. VQA Model Design
3.1. Data Preparation and Cross-Validation Strategy
3.2. Subtractive Clustering Parameter Selection
- Range of influence (RoI) defines the radius, in the normalized data space, within which an individual data point contributes to the "potential" of a candidate cluster center. A larger range of influence means that each data point affects a broader area, which generally yields fewer clusters.
- After a cluster center is chosen, the potential of all data points in its vicinity is reduced ("squashed"). The squash factor (SF) determines how strongly the potential is diminished in these regions. A higher squash factor results in a stronger reduction of potential near the selected center, which can prevent overlapping clusters and ensure a clearer separation.
- The accept ratio (AR) sets a threshold (relative to the potential of the first selected cluster center) above which a candidate is automatically accepted as a new center.
- In contrast to the accept ratio, the reject ratio (RR) establishes a threshold below which a candidate is automatically rejected. A compact sketch of how the four parameters interact in the selection loop follows this list.
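The sketch below illustrates Chiu's subtractive clustering, to make the role of the four parameters concrete. It assumes data normalized to the unit hypercube, is a simplified illustration rather than the paper's implementation, and omits the distance test that the full algorithm applies to candidates falling between the reject and accept thresholds.

```python
import numpy as np

def subtractive_clustering(X, roi=0.8, sf=1.1, accept=0.5, reject=0.1):
    """Return cluster centers for data X (n x d, normalized to [0, 1])."""
    ra, rb = roi, sf * roi                       # influence and squash radii
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)  # pairwise sq. distances
    potential = np.exp(-4.0 * d2 / ra**2).sum(axis=1)    # each point's potential
    p_first = float(potential.max())             # reference: first center's potential
    centers = []
    while True:
        k = int(np.argmax(potential))
        p_k = float(potential[k])
        if p_k <= reject * p_first:
            break                                # below reject threshold: stop
        if p_k < accept * p_first:
            # Gray zone between reject and accept: the full algorithm applies an
            # extra distance test against existing centers; this sketch simply
            # accepts the candidate to stay short.
            pass
        centers.append(X[k].copy())
        # "Squash" the potential around the new center so nearby points are
        # unlikely to become centers themselves.
        potential = np.maximum(potential - p_k * np.exp(-4.0 * d2[k] / rb**2), 0.0)
    return np.array(centers)

# Example: three well-separated 2-D blobs should yield roughly one center each.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(c, 0.05, size=(30, 2)) for c in (0.2, 0.5, 0.8)])
print(subtractive_clustering(X, roi=0.3))
```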
3.3. Membership Functions and Compact Rule Base
3.4. Surface Visualisation and Sensitivity Analysis
4. Review and Comparative Evaluation of VQA Models
4.1. Review of VQA Models
4.2. Performance Comparison
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Abbreviations
Abbreviation | Definition
---|---
ANFIS | Adaptive Neuro-Fuzzy Inference System
AR | Accept Ratio
BH-VQA | Blind High Frame Rate Video Quality Assessment
CRF | Constant Rate Factor
FCM | Fuzzy C-Means
FIS | Fuzzy Inference System
FPS | Frames per Second
FRQM | Frame Rate Quality Metric
FSIM | Feature Similarity Index
GSTI | Gradient-based Spatio-Temporal Index
HFR | High Frame Rate
MOS | Mean Opinion Score
MS-SSIM | Multi-Scale Structural Similarity
PCC | Pearson Correlation Coefficient
PSNR | Peak Signal-to-Noise Ratio
QoE | Quality of Experience
RMSE | Root Mean Squared Error
RoI | Range of Influence
RR | Reject Ratio
SAR | Synthetic Aperture Radar
SD | Standard Deviation
SF | Squash Factor
SI | Spatial Information
SpEED | Spatio-Temporal Entropic Differencing
SROCC | Spearman Rank-Order Correlation Coefficient
SSCQE | Single-Stimulus Continuous Quality Evaluation
SSIM | Structural Similarity Index
ST-RRED | Spatio-Temporal Reduced Reference Entropy Difference
TI | Temporal Information
VMAF | Video Multi-method Assessment Fusion
VQA | Video Quality Assessment
References
- AppLogic Networks. Global Internet Phenomena Report. 2025. Available online: https://www.applogicnetworks.com/phenomena (accessed on 19 March 2025).
- Lu, W.; Sun, W.; Zhang, Z.; Tu, D.; Min, X.; Zhai, G. BH-VQA: Blind High Frame Rate Video Quality Assessment. In Proceedings of the 2023 IEEE International Conference on Multimedia and Expo (ICME), Brisbane, Australia, 10–14 July 2023; pp. 2501–2506.
- Madhusudana, P.C.; Yu, X.; Birkbeck, N.; Wang, Y.; Adsumilli, B.; Bovik, A.C. Subjective and Objective Quality Assessment of High Frame Rate Videos. IEEE Access 2021, 9, 108069–108082.
- Matulin, M.; Mrvelj, Š. Modelling User Quality of Experience from Objective and Subjective Data Sets Using Fuzzy Logic. Multimed. Syst. 2018, 24, 645–667.
- Madhusudana, P.C.; Birkbeck, N.; Wang, Y.; Adsumilli, B.; Bovik, A.C. High Frame Rate Video Quality Assessment Using VMAF and Entropic Differences. In Proceedings of the 2021 Picture Coding Symposium (PCS), Virtual, 29 June–2 July 2021; pp. 1–5.
- Mrvelj, Š.; Matulin, M. FLAME-VQA: A Fuzzy Logic-Based Model for High Frame Rate Video Quality Assessment. Future Internet 2023, 15, 295.
- Chennagiri, P.; Yu, X.; Birkbeck, N.; Wang, Y.; Adsumilli, B.; Bovik, A. LIVE YouTube High Frame Rate (LIVE-YT-HFR) Database. Available online: https://live.ece.utexas.edu/research/LIVE_YT_HFR/LIVE_YT_HFR/index.html (accessed on 3 February 2025).
- Mackin, A.; Zhang, F.; Bull, D.R. A Study of Subjective Video Quality at Various Frame Rates. In Proceedings of the 2015 IEEE International Conference on Image Processing (ICIP), Quebec City, QC, Canada, 27–30 September 2015; pp. 3407–3411.
- ITU-R. Recommendation ITU-R BT.500-11: Methodology for the Subjective Assessment of the Quality of Television Pictures; International Telecommunication Union, Radiocommunication Sector: Geneva, Switzerland, 2002. Available online: https://www.itu.int/dms_pubrec/itu-r/rec/bt/R-REC-BT.500-11-200206-S!!PDF-E.pdf (accessed on 20 April 2025).
- MathWorks. Neuro-Adaptive Learning and ANFIS. Available online: https://www.mathworks.com/help/fuzzy/neuro-adaptive-learning-and-anfis.html (accessed on 11 March 2025).
- Rao, U.M.; Sood, Y.R.; Jarial, R.K. Subtractive Clustering Fuzzy Expert System for Engineering Applications. Procedia Comput. Sci. 2015, 48, 77–83.
- Chang, S.; Deng, Y.; Zhang, Y.; Zhao, Q.; Wang, R.; Zhang, K. An Advanced Scheme for Range Ambiguity Suppression of Spaceborne SAR Based on Blind Source Separation. IEEE Trans. Geosci. Remote Sens. 2022, 60, 1–12.
- Ross, T.J. Fuzzy Logic with Engineering Applications, 4th ed.; Wiley: Hoboken, NJ, USA, 2016.
- Al-Hadithi, B.M.; Gómez, J. Fuzzy Control of Multivariable Nonlinear Systems Using T–S Fuzzy Model and Principal Component Analysis Technique. Processes 2025, 13, 217.
- Wang, Z.; Bovik, A.C.; Sheikh, H.R.; Simoncelli, E.P. Image Quality Assessment: From Error Visibility to Structural Similarity. IEEE Trans. Image Process. 2004, 13, 600–612.
- Wang, Z.; Simoncelli, E.P.; Bovik, A.C. Multiscale Structural Similarity for Image Quality Assessment. In Proceedings of the Thirty-Seventh Asilomar Conference on Signals, Systems & Computers, Pacific Grove, CA, USA, 9–12 November 2003; Volume 2, pp. 1398–1402.
- Zhang, L.; Zhang, L.; Mou, X.; Zhang, D. FSIM: A Feature Similarity Index for Image Quality Assessment. IEEE Trans. Image Process. 2011, 20, 2378–2386.
- Soundararajan, R.; Bovik, A.C. Video Quality Assessment by Reduced Reference Spatio-Temporal Entropic Differencing. IEEE Trans. Circuits Syst. Video Technol. 2013, 23, 684–694.
- Bampis, C.G.; Gupta, P.; Soundararajan, R.; Bovik, A.C. SpEED-QA: Spatial Efficient Entropic Differencing for Image and Video Quality. IEEE Signal Process. Lett. 2017, 24, 1333–1337.
- Zhang, F.; Mackin, A.; Bull, D.R. A Frame Rate Dependent Video Quality Metric Based on Temporal Wavelet Decomposition and Spatiotemporal Pooling. In Proceedings of the 2017 IEEE International Conference on Image Processing (ICIP), Beijing, China, 17–20 September 2017; pp. 300–304.
- Kim, W.; Kim, J.; Ahn, S.; Kim, J.; Lee, S. Deep Video Quality Assessor: From Spatio-Temporal Visual Sensitivity to a Convolutional Neural Aggregation Network. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018.
- Madhusudana, P.C.; Birkbeck, N.; Wang, Y.; Adsumilli, B.; Bovik, A.C. Capturing Video Frame Rate Variations via Entropic Differencing. IEEE Signal Process. Lett. 2020, 27, 1809–1813.
- Ramachandra Rao, R.R.; Göring, S.; Raake, A. AVQBits—Adaptive Video Quality Model Based on Bitstream Information for Various Video Applications. IEEE Access 2022, 10, 80321–80351.
- Netflix. VMAF: Video Multi-method Assessment Fusion. Available online: https://github.com/Netflix/vmaf (accessed on 12 January 2025).
- Kotevski, Z.; Mitrevski, P. Experimental Comparison of PSNR and SSIM Metrics for Video Quality Estimation. In Proceedings of ICT Innovations 2009, Ohrid, Macedonia, 28–30 September 2009; Davcev, D., Gómez, J.M., Eds.; Springer: Berlin/Heidelberg, Germany, 2010; pp. 357–366.
Rank | RoI | SF | AR | RR | Mean RMSE ± SD | Mean PCC
---|---|---|---|---|---|---
1 | 0.8 | 1.1 | 0.5 | 0.1 | 3.080 ± 0.317 | 0.913
2 | 0.8 | 1.1 | 0.6 | 0.1 | 3.080 ± 0.318 | 0.913
3 | 0.7 | 1.1 | 0.5 | 0.3 | 3.105 ± 0.342 | 0.907
4 | 0.7 | 1.1 | 0.6 | 0.3 | 3.105 ± 0.343 | 0.907
5 | 0.8 | 1.1 | 0.5 | 0.2 | 3.118 ± 0.346 | 0.909
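A hedged sketch of the kind of exhaustive search that produces such a ranking is shown below. `train_and_eval_anfis` is a hypothetical stand-in for the actual ANFIS training pipeline, and the five-fold split is illustrative, not the paper's protocol.

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)

def train_and_eval_anfis(fold, roi, sf, ar, rr):
    """Hypothetical stand-in: train the subtractive-clustering ANFIS on one
    cross-validation fold and return the validation RMSE. The dummy return
    value only keeps this sketch runnable."""
    return float(rng.normal(3.2, 0.3))

grid = {"roi": [0.5, 0.6, 0.7, 0.8],   # range of influence
        "sf":  [1.1, 1.3, 1.5],        # squash factor
        "ar":  [0.5, 0.6, 0.7],        # accept ratio
        "rr":  [0.1, 0.2, 0.3]}        # reject ratio

results = []
for roi, sf, ar, rr in itertools.product(*grid.values()):
    fold_rmse = [train_and_eval_anfis(f, roi, sf, ar, rr) for f in range(5)]
    results.append(((roi, sf, ar, rr), np.mean(fold_rmse), np.std(fold_rmse)))

results.sort(key=lambda item: item[1])        # rank by mean RMSE, best first
for params, mean_rmse, sd in results[:5]:
    print(params, f"{mean_rmse:.3f} ± {sd:.3f}")
```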
Parameter | Levels Tested | H-Statistic | p-Value | Interpretation
---|---|---|---|---
RoI | 0.50/0.60/0.70/0.80 | 11.77 | 0.0082 | RoI has a statistically significant effect (α = 0.05).
SF | 1.1/1.3/1.5 | 123.88 | 1.3 × 10⁻²⁷ | SF is the dominant driver of performance.
AR | 0.5/0.6/0.7 | 0.00 | 1.000 | AR has no measurable impact in the tested range.
RR | 0.1/0.2/0.3 | 62.98 | 2.1 × 10⁻¹⁴ | Reject ratio significantly affects RMSE.
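A test of this kind can be reproduced with `scipy.stats.kruskal` by grouping the cross-validated RMSE scores by the levels of one parameter. The sketch below assumes the `results` list from the grid-search sketch above; the helper name and data layout are assumptions, not the paper's code.

```python
from scipy.stats import kruskal

def parameter_effect(results, param_index, levels):
    """Kruskal-Wallis H-test of mean RMSE grouped by one parameter's levels.
    `results` holds tuples of ((roi, sf, ar, rr), mean_rmse, sd)."""
    groups = [[mean_rmse for params, mean_rmse, _ in results
               if params[param_index] == level]
              for level in levels]
    return kruskal(*groups)  # returns (H-statistic, p-value)

# e.g., the squash factor (index 1 in the parameter tuple):
h, p = parameter_effect(results, 1, [1.1, 1.3, 1.5])
```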
Video CRF | Video FPS | Video SI | Video TI | Predicted MOS | Consequent Parameters
---|---|---|---|---|---
High | Medium | HighDetail | HighMotion | OutputFunction1 | (−0.87, 0.12, 0.73, 0.14, −34.83)
Medium | VeryLow | HighDetail | VeryHighMotion | OutputFunction2 | (−0.06, 0.42, 0.12, −0.03, 5.48)
Low | High | MediumDetail | HighMotion | OutputFunction3 | (0.07, 0.08, 0.1, −0.05, 20.53)
VeryHigh | Medium | MediumDetail | MediumMotion | OutputFunction4 | (−3.67, 0.13, 0.48, 0.11, 246.35)
High | VeryHigh | LowDetail | LowMotion | OutputFunction5 | (−0.52, 0.19, −0.31, −0.06, 31.72)
VeryHigh | VeryLow | HighDetail | VeryHighMotion | OutputFunction6 | (−0.49, 0.3, 0.14, −0.002, 25.79)
Medium | Low | LowDetail | MediumMotion | OutputFunction7 | (−0.12, 0.14, −0.49, 0.02, 31.62)
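Each row of the table above is one first-order Sugeno rule: its consequent is a linear function of the four crisp inputs, with the five-tuple giving the CRF, FPS, SI, and TI coefficients and the constant term. The minimal sketch below evaluates the consequent of the third rule (OutputFunction3); the input values are invented purely for illustration.

```python
def sugeno_consequent(crf, fps, si, ti, params):
    """First-order Sugeno rule output: a linear function of the inputs.
    params = (a_crf, a_fps, a_si, a_ti, bias)."""
    a_crf, a_fps, a_si, a_ti, bias = params
    return a_crf * crf + a_fps * fps + a_si * si + a_ti * ti + bias

# Third rule of the table; the inputs are illustrative, not from the paper.
out3 = sugeno_consequent(crf=22, fps=120, si=55, ti=40,
                         params=(0.07, 0.08, 0.1, -0.05, 20.53))
# The final predicted MOS is the firing-strength-weighted average of all
# rule outputs: MOS = sum_i(w_i * f_i) / sum_i(w_i).
print(out3)  # 35.17 for these illustrative inputs
```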
VQA Model | RMSE | PCC | SROCC
---|---|---|---
PSNR | 9.023 | 0.6685 | 0.695
SSIM | 10.819 | 0.4526 | 0.4494
MS-SSIM | 10.726 | 0.4673 | 0.4898
FSIM | 10.502 | 0.5008 | 0.5251
ST-RRED | 10.431 | 0.5107 | 0.5531
SpEED | 10.866 | 0.4449 | 0.4861
FRQM | 10.804 | 0.452 | 0.4216
DeepVQA | 11.441 | 0.3329 | 0.3463
GSTI | 7.422 | 0.791 | 0.7909
AVQBits\|M3 | * | 0.7805 | 0.7118
AVQBits\|M1 | * | 0.5528 | 0.4809
AVQBits\|M0 | * | 0.5538 | 0.4947
AVQBits\|H0\|s | * | 0.7887 | 0.7324
AVQBits\|H0\|f | * | 0.7242 | 0.674
VMAF | 8.587 | 0.7071 | 0.7303
FLAME-VQA | 2.9598 | 0.9086 | 0.8961
BH-VQA (LSVQ) | 6.303 | 0.847 | 0.816
BH-VQA (ImgNet) | 7.401 | 0.795 | 0.784
New ANFIS-VQA | 2.9091 | 0.9174 | 0.9048