Open Access Article

Peer-Review Record

FLAME-VQA: A Fuzzy Logic-Based Model for High Frame Rate Video Quality Assessment

Future Internet 2023, 15(9), 295; https://doi.org/10.3390/fi15090295

by Štefica Mrvelj and Marko Matulin *

Reviewer 1: Anonymous

Reviewer 2: Abdussalam Elhanashi

Submission received: 14 August 2023 / Revised: 25 August 2023 / Accepted: 29 August 2023 / Published: 1 September 2023

(This article belongs to the Special Issue QoS in Wireless Sensor Network for IoT Applications)

Round 1

Reviewer 1 Report

The authors have proposed a fuzzy logic-based model for high frame rate video quality assessment. The algorithm is analysed on a dataset and the results are discussed. Fuzzy logic-based models for high frame rate video quality assessment face several challenges due to the intricacies of video content, human perception, and the specific requirements of high frame rate (HFR) video. I suggest that the authors include the following information in the manuscript.

1. Human perception of video quality can differ significantly at high frame rates compared to standard frame rates. The relationship between frame rate and perceived quality may not be linear, and different viewers might have varying preferences for HFR content. Fuzzy logic models need to account for these perceptual differences effectively. Did all 85 participants who assessed the 480 test sequences use the same experimental equipment?

2. If not, it would be good to add the range of values used by the participants. Video quality assessment also varies with age, so these factors should be properly listed in the paper. They cannot be generalised.

3. Developing and training fuzzy logic-based models requires a sufficient amount of high-quality data. However, compared to regular frame rate video datasets, high-quality high frame rate datasets are less prevalent. The limited availability of such datasets can hinder the training and evaluation of HFR quality assessment models. Your proposed algorithm is tested on LIVE-YT-HFR. Have you evaluated your algorithm on other available datasets?

4. Fuzzy logic models need to capture temporal dependencies and interactions among frames effectively. In HFR videos, frames are closely spaced in time, leading to stronger temporal dependencies that must be accounted for to ensure accurate quality assessment. Your paper has some information on Temporal Information (TI), but it needs more in-depth discussion.

5. HFR content can span various genres, such as sports, movies, and animations. Fuzzy logic models should be capable of adapting to different content types and genres to provide accurate quality assessment across a wide range of scenarios. It would be good to discuss this in the paper.

Minor English language editing required.

Author Response

Point 1: Human perception of video quality can differ significantly at high frame rates compared to standard frame rates. The relationship between frame rate and perceived quality may not be linear, and different viewers might have varying preferences for HFR content. Fuzzy logic models need to account for these perceptual differences effectively. Did all 85 participants who assessed the 480 test sequences use the same experimental equipment?

Response 1: The reviewer is right that HFR content needs specific attention when analyzing video quality, since the concept of ‘quality’ varies with different users and viewing conditions. Thus, it makes sense to elaborate on the testing conditions of the experiments conducted in [9]. All participants evaluated the videos under the same conditions. This is now stated in the paper, together with a brief overview of those viewing conditions (subsection 3.2).

Point 2: If not, it would be good to add the range of values used by the participants. Video quality assessment also varies with age, so these factors should be properly listed in the paper. They cannot be generalised.

Response 2: All test participants rated the video quality under the same conditions. The information about the participants' age group is given in the first paragraph of subsection 3.2. There is no further information about the age groups in [9]; thus, we are unable to provide more details.

Point 3: Developing and training fuzzy logic-based models requires a sufficient amount of high-quality data. However, compared to regular frame rate video datasets, high-quality high frame rate datasets are less prevalent. The limited availability of such datasets can hinder the training and evaluation of HFR quality assessment models. Your proposed algorithm is tested on LIVE-YT-HFR. Have you evaluated your algorithm on other available datasets?

Response 3: At this stage, we tested our algorithm against other algorithms on the same dataset. In our future work, we could expand our study to other datasets. We have added this new path to our research in the concluding section.

Point 4: Fuzzy logic models need to capture temporal dependencies and interactions among frames effectively. In HFR videos, frames are closely spaced in time, leading to stronger temporal dependencies that must be accounted for to ensure accurate quality assessment. Your paper has some information on Temporal Information (TI), but it needs more in-depth discussion.

Response 4: The reviewer is correct. The TI as a metric gains importance when evaluating the quality of HFR videos. We have added more information about how the model output varies based on the video FPS and TI (subsection 5.1).
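For readers unfamiliar with the metric, the following is a minimal sketch of how TI could be computed, assuming the common definition from ITU-T Rec. P.910 (the maximum over time of the spatial standard deviation of successive frame differences). The record does not state how the authors computed TI; the function name `temporal_information` and the tiny 2x2 luma frames are hypothetical, chosen only for illustration.

```python
from statistics import pstdev

def temporal_information(frames):
    """TI sketch: max over time of the spatial std of frame differences.

    `frames` is a list of luma frames, each a list of rows of pixel values.
    Assumes the ITU-T P.910 style definition; not the authors' code.
    """
    per_frame_std = []
    for prev, curr in zip(frames, frames[1:]):
        # Pixel-wise difference between consecutive frames, flattened.
        diff = [c - p
                for prev_row, curr_row in zip(prev, curr)
                for p, c in zip(prev_row, curr_row)]
        per_frame_std.append(pstdev(diff))
    return max(per_frame_std)

# Hypothetical three-frame sequence: motion between frames 0 and 1 only.
frames = [
    [[0, 0], [0, 0]],
    [[0, 10], [0, 10]],
    [[0, 10], [0, 10]],
]
ti = temporal_information(frames)
print(ti)
```

At a higher frame rate the same motion is spread over more frames, so each frame difference is smaller, which is one reason TI and FPS interact when predicting quality.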

Point 5: HFR content can span various genres, such as sports, movies, and animations. Fuzzy logic models should be capable of adapting to different content types and genres to provide accurate quality assessment across a wide range of scenarios. It would be good to discuss this in the paper.

Response 5: Continuing on the previous response, we have added more information about the sequences, including the genre information (subsection 3.1).

The authors wish to thank Reviewer 1 for the received comments. The experiment and the sequences used in it are now explained in more detail as well as the obtained results, which only improves the manuscript. We also gained another possible path for our future research.

Reviewer 2 Report

This research introduces FLAME-VQA, a Fuzzy Logic-based Model for Video Quality Assessment. It addresses challenges in predicting user perception by leveraging a database with video sequences and subjective ratings. FLAME-VQA utilizes fuzzy logic to capture individual preferences and video attributes, yielding high Spearman and Pearson correlation coefficients (SROCC and PCC) with ground truth.
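As an illustration of the two agreement measures named above, here is a minimal sketch of PCC and SROCC in plain Python. The rating values are invented for the example and are not taken from the paper or from LIVE-YT-HFR; the helper names are likewise hypothetical.

```python
from math import sqrt

def pearson(x, y):
    """Pearson linear correlation coefficient (PCC) of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)

def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman rank-order correlation (SROCC): Pearson on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical subjective scores and model predictions.
subjective = [1.2, 2.5, 3.1, 4.0, 4.8]
predicted = [1.0, 2.0, 3.5, 3.9, 5.0]
pcc = pearson(subjective, predicted)
srocc = spearman(subjective, predicted)
print(pcc, srocc)
```

SROCC only checks that the predictions order the videos the same way as the viewers did, while PCC also rewards a linear relationship between predicted and subjective scores.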

1. What is the primary focus of the article, and why is predicting user perception of video quality important in this context?

2. It is suggested to add the following article in the related work:

- Sergio Saponara, Abdussalam Elhanashi, Alessio Gagliardi, "Reconstruct fingerprint images using deep learning and sparse autoencoder algorithms," Proc. SPIE 11736, Real-Time Image Processing and Deep Learning 2021, 1173603 (12 April 2021); https://doi.org/10.1117/12.2585707

3. How does the article address the challenges of assessing user perception and quality in video streaming?

4. Can you explain the rationale behind introducing the Fuzzy Logic-based Model for Video Quality Assessment (FLAME-VQA)?

5. Could you provide more details about the LIVE-YT-HFR database and how it was used in this study?

6. How were the four input parameters (video frame rate, compression rate, spatio-temporal information) fuzzified using the Fuzzy C-Means clustering approach?

7. What criteria were used to determine the optimal number of clusters for each input parameter?

8. Can you elaborate on the process of defining the rule-based system for the fuzzy inference system (FIS)?

9. How was the defuzzification process performed to obtain crisp results for video quality assessment?

Further proofreading is required.

Author Response

Point 1: What is the primary focus of the article, and why is predicting user perception of video quality important in this context?

Response 1: We have added the requested discussion on the primary focus of our study in subsection 1.2.

Point 2: It is suggested to add the following article in the related work:

Sergio Saponara, Abdussalam Elhanashi, Alessio Gagliardi, "Reconstruct fingerprint images using deep learning and sparse autoencoder algorithms," Proc. SPIE 11736, Real-Time Image Processing and Deep Learning 2021, 1173603 (12 April 2021); https://doi.org/10.1117/12.2585707

Response 2: As we indicated in Section 2, our review is focused on works that presented models or algorithms for the evaluation of video quality, specifically those studies that made an effort to assess/predict user perception in the QoE context. The proposed article is within the domain of fingerprint reconstruction from images and we believe that it falls out of the scope of our review presented in the paper.

Point 3: How does the article address the challenges of assessing user perception and quality in video streaming?

Response 3: In subsection 1.1 we discuss how we addressed the challenges (text lines no. 90 onwards). We elaborate on proposing the fuzzy logic-based model that can capture nuances of human perception of video quality and produce reliable predictions of user quality ratings for different video streaming scenarios.

Point 4: Can you explain the rationale behind introducing the Fuzzy Logic-based Model for Video Quality Assessment (FLAME-VQA)?

Response 4: This comment relates to comments no. 1 and 3, and we believe the reviewer will appreciate the extended discussion about the purpose of the research and the introduction of fuzzy logic as the model’s driving engine.

Point 5: Could you provide more details about the LIVE-YT-HFR database and how it was used in this study?

Response 5: We have elaborated in more detail in subsections 3.1 and 3.2 on the database and how it was used. We also added more information on Video TI and its effect on the model output in subsection 5.1.

Point 6: How were the four input parameters (video frame rate, compression rate, spatio-temporal information) fuzzified using the Fuzzy C-Means clustering approach?

Response 6: We expanded our discussion on fuzzification and FCM clustering in subsection 4.2.1.
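To make the fuzzification step concrete, the following is a minimal sketch of the standard Fuzzy C-Means iteration for a single input variable. It is a generic textbook version, not the authors' implementation: the fuzzifier `m = 2`, the evenly spread starting centers, the function names, and the frame-rate values are all assumptions made for the example.

```python
def memberships_for(data, centers, m=2.0):
    """Membership of each data point in each cluster (each row sums to 1)."""
    exponent = 2.0 / (m - 1.0)
    rows = []
    for x in data:
        dists = [abs(x - c) for c in centers]
        if min(dists) == 0.0:
            # Point sits exactly on a center: give it full membership there.
            hits = [1.0 if d == 0.0 else 0.0 for d in dists]
            rows.append([h / sum(hits) for h in hits])
        else:
            rows.append([1.0 / sum((di / dj) ** exponent for dj in dists)
                         for di in dists])
    return rows

def fcm_1d(data, n_clusters, m=2.0, max_iter=200, tol=1e-6):
    """Fuzzy C-Means on 1-D data; returns (centers, memberships)."""
    lo, hi = min(data), max(data)
    # Deterministic start (an assumption): centers spread over the data range.
    centers = [lo + (hi - lo) * (i + 0.5) / n_clusters
               for i in range(n_clusters)]
    for _ in range(max_iter):
        u = memberships_for(data, centers, m)
        new_centers = []
        for i in range(n_clusters):
            weights = [row[i] ** m for row in u]
            new_centers.append(
                sum(w * x for w, x in zip(weights, data)) / sum(weights))
        shift = max(abs(a - b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    return centers, memberships_for(data, centers, m)

# Illustrative frame-rate values (fps), clustered into three fuzzy sets.
fps_values = [24, 30, 60, 82, 98, 120]
centers, memberships = fcm_1d(fps_values, n_clusters=3)
print(centers)
```

Each row of `memberships` gives the degree to which one input value belongs to each cluster, which is what a membership function of a fuzzy input variable encodes.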

Point 7: What criteria were used to determine the optimal number of clusters for each input parameter?

Response 7: The decision regarding the optimal number of clusters for the FCM algorithm initially involved a heuristic approach using two clustering validity indices, namely the FCM partition coefficient (PC) and partition entropy (PE). Hence, in this initial stage, each input parameter was described with two fuzzy clusters. As our analysis progressed, we observed that the FCM partitions formed with two clusters per input variable did not fully capture the complexity and nuances of the video quality assessment task. Subsequently, we decided to refine our approach by introducing an additional cluster for each input variable, resulting in a total of three clusters for each. This is elaborated in subsection 4.2.1.
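The two validity indices mentioned in this response can be sketched as follows, assuming Bezdek's usual definitions: PC is the mean of the squared memberships and PE is the mean negative entropy of the memberships. The membership matrices below are hypothetical and serve only to show the extreme cases.

```python
from math import log

def partition_coefficient(memberships):
    """PC: mean squared membership; 1 for a crisp partition, 1/c if fully fuzzy."""
    n = len(memberships)
    return sum(u * u for row in memberships for u in row) / n

def partition_entropy(memberships):
    """PE: mean entropy of the memberships; 0 for a crisp partition."""
    n = len(memberships)
    return -sum(u * log(u) for row in memberships for u in row if u > 0.0) / n

# Hypothetical membership matrices: rows are data points, columns are clusters.
crisp = [[1.0, 0.0], [0.0, 1.0]]
fully_fuzzy = [[0.5, 0.5], [0.5, 0.5]]

print(partition_coefficient(crisp), partition_entropy(crisp))
print(partition_coefficient(fully_fuzzy), partition_entropy(fully_fuzzy))
```

A higher PC and a lower PE indicate a crisper partition. Both indices tend to favour a small number of clusters, which is consistent with the authors starting from two clusters and then adding a third on task-specific grounds.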

Point 8: Can you elaborate on the process of defining the rule-based system for the fuzzy inference system (FIS)?

Response 8: The process of defining the fuzzy rules of the model’s inference system is now explained in subsection 4.2.2.

Point 9: How was the defuzzification process performed to obtain crisp results for video quality assessment?

Response 9: We used the centroid method to obtain the crisp result. The method is now briefly explained in subsection 4.2.2.
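A minimal sketch of centroid defuzzification is given below, assuming a Mamdani-style aggregated output set sampled on a discrete grid. The triangular output sets, the 0 to 100 quality scale, and the firing strengths are hypothetical and are not taken from the paper.

```python
def triangle(x, a, b, c):
    """Triangular membership function with feet at a and c and peak at b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)

def centroid(xs, mu):
    """Centroid (centre of gravity) of a sampled membership function."""
    area = sum(mu)
    if area == 0.0:
        raise ValueError("no rule fired: aggregated membership is zero")
    return sum(x * m for x, m in zip(xs, mu)) / area

# Hypothetical output universe: a quality score from 0 to 100.
xs = list(range(101))

# Two hypothetical output sets, clipped at the firing strengths of their
# rules (0.2 and 0.8) and aggregated with max.
aggregated = [
    max(min(0.2, triangle(x, 0, 25, 50)), min(0.8, triangle(x, 50, 75, 100)))
    for x in xs
]
crisp_score = centroid(xs, aggregated)
print(crisp_score)
```

Because the second rule fires more strongly, the crisp score is pulled towards the upper half of the scale while still reflecting the weaker contribution of the first rule.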

The authors wish to thank Reviewer 2 for the received comments. The process behind the model's inference system is now clarified, which benefits the overall flow of the paper.

Round 2

Reviewer 1 Report

I have read the revised manuscript. Most of my questions have been answered and the manuscript is improved. I recommend accepting this article.

Minor editing required.

Reviewer 2 Report

Thanks to the authors for their response and implementation.
