Towards Quality Assessment for Arbitrary Translational 6DoF Video: Subjective Quality Database and Objective Assessment Metric
Abstract
1. Introduction
- (1) This paper establishes a new arbitrary translational 6DoF synthesized video quality database for exploring higher-DoF video quality. The database contains five sequences, four levels of compression distortion, and four levels of rendering distortion, with particular focus on three viewpoint navigation paths in 3D space that have not been explored in previous research.
- (2) For the specific distortion types of arbitrary translational 6DoF synthesized videos, this paper proposes a no-reference objective quality assessment method. The proposed method leverages multiscale and multi-resolution statistical modeling to extract features for crack, rendering, and compression distortions, thereby effectively describing distortions with complex spatio-temporal and local–global distributions (a minimal code sketch of this pipeline follows the list).
- (3) The established subjective QA database covers viewpoint navigation paths that previous databases lack, while the proposed objective QA method exploits the spatio-temporal characteristics of the distortions to capture the impact of high-DoF distortions on video quality. Extensive experiments on the established subjective database validate the superiority of the proposed objective method.
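For orientation, the sketch below shows one way such a no-reference pipeline can be organized in code: a multiscale/multi-resolution decomposition of each frame, simple natural-scene-statistics features per scale, and a random forest regressor mapping features to quality scores. This is a minimal illustration under stated assumptions, not the authors' implementation; the MSCN statistics, the helper names (`multiscale_space`, `frame_features`), and the toy data are placeholders.

```python
import numpy as np
from scipy.ndimage import gaussian_filter, zoom
from sklearn.ensemble import RandomForestRegressor

def multiscale_space(frame, n_scales=5):
    """Build a simple Gaussian multiscale/multi-resolution stack of a gray frame."""
    scales = [frame.astype(np.float64)]
    for _ in range(n_scales - 1):
        blurred = gaussian_filter(scales[-1], sigma=1.0)
        scales.append(zoom(blurred, 0.5, order=1))   # halve the resolution at each level
    return scales

def mscn_statistics(img):
    """Mean-subtracted contrast-normalized (MSCN) moments, a common NSS feature choice."""
    mu = gaussian_filter(img, sigma=7 / 6)
    sigma = np.sqrt(np.abs(gaussian_filter(img ** 2, sigma=7 / 6) - mu ** 2))
    mscn = (img - mu) / (sigma + 1.0)
    return [mscn.mean(), mscn.var(), ((mscn - mscn.mean()) ** 3).mean()]

def frame_features(frame, n_scales=5):
    """Concatenate per-scale statistics into one feature vector for the regressor."""
    feats = []
    for level in multiscale_space(frame, n_scales):
        feats.extend(mscn_statistics(level))
    return np.array(feats)

# Toy usage: random "frames" stand in for decoded 6DoF synthesized video frames,
# and random scores stand in for MOS labels.
rng = np.random.default_rng(0)
X = np.stack([frame_features(rng.random((128, 128))) for _ in range(40)])
y = rng.random(40)
model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)
print(model.predict(X[:3]))
```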
2. Related Works
2.1. Subjective Quality Assessment
2.2. Objective Quality Assessment
3. Subjective Quality Dataset of Arbitrary Translational 6DoF Video
3.1. Generation of Arbitrary Translational 6DoF Videos
3.2. Subjective Quality Testing
3.3. Analysis of Subjective Testing Results
4. Objective VQA Metric for Arbitrary Translational 6DoF Videos
4.1. Distortion Analysis and Objective Metric Framework
4.2. Multi-Resolution and Multiscale Spaces
4.3. Crack Distortion Assessment
4.4. Rendering Distortion Assessment
4.5. Compression Distortion Assessment
4.6. Overall Quality Prediction
5. Experimental Results and Analysis
5.1. Performance Comparison
5.2. Performance Dependency of Multiscale Space
5.3. Performance Dependency of Neighborhood Pixel Range
5.4. Performance Dependency of Auto-Correlation Threshold
5.5. Performance Dependency of Regression Model and Training-Testing Ratio
5.6. Ablation Studies
5.7. Performance of Individual Video Sequence
6. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- Xu, D.; Fan, X.; Gao, W. Multiscale Attention Fusion for Depth Map Super-Resolution Generative Adversarial Networks. Entropy 2023, 25, 836. [Google Scholar] [CrossRef] [PubMed]
- Chen, Y.; Jiang, G.; Yu, M.; Jin, C.; Xu, H.; Ho, Y.-S. HDR Light Field Imaging of Dynamic Scenes: A Learning-Based Method and a Benchmark Dataset. Pattern Recogn. 2024, 150, 110313. [Google Scholar] [CrossRef]
- Liu, S.; Zhu, C.; Li, Z.; Yang, Z.; Gu, W. View-Driven Multi-View Clustering via Contrastive Double-Learning. Entropy 2024, 26, 470. [Google Scholar] [CrossRef]
- Bui, T.H.; Hamamoto, K.; Paing, M.P. Automated Caries Screening Using Ensemble Deep Learning on Panoramic Radiographs. Entropy 2022, 24, 1358. [Google Scholar] [CrossRef] [PubMed]
- Rahaman, D.M.M.; Paul, M. Virtual View Synthesis for Free Viewpoint Video and Multiview Video Compression Using Gaussian Mixture Modelling. IEEE Trans. Image Process. 2018, 27, 1190–1201. [Google Scholar] [CrossRef]
- Okarma, K.; Chlewicki, W.; Kopytek, M.; Marciniak, B.; Lukin, V. Entropy-Based Combined Metric for Automatic Objective Quality Assessment of Stitched Panoramic Images. Entropy 2021, 23, 1525. [Google Scholar] [CrossRef]
- Jung, J.; Kroon, B.; Doré, R.; Lafruit, G.; Boyce, J. Common Test Conditions on 3DoF+ and Windowed 6DoF. In Proceedings of the 124th MPEG Meeting, Macao, China, 8–12 October 2018. [Google Scholar]
- Guo, S.; Hu, J.; Zou, K.; Wang, J.; Song, L.; Xie, R.; Zhang, W. Real-Time Free Viewpoint Video Synthesis System Based on DIBR and a Depth Estimation Network. IEEE Trans. Multimed. 2024, 26, 6701–6716. [Google Scholar] [CrossRef]
- Wu, T.; Li, W.; Jia, S.; Dong, Y.; Zeng, T. Deep Multi-Level Wavelet-CNN Denoiser Prior for Restoring Blurred Image with Cauchy Noise. IEEE Signal Process. Lett. 2020, 27, 1635–1639. [Google Scholar] [CrossRef]
- Jin, C.; Peng, Z.; Chen, F.; Jiang, G. Subjective and Objective Video Quality Assessment for Windowed-6DoF Synthesized Videos. IEEE Trans. Broadcast. 2022, 68, 594–608. [Google Scholar] [CrossRef]
- Carballeira, P.; Carmona, C.; Díaz, C.; Berjón, D.; Corregidor, D.; Cabrera, J.; Morán, F.; Doblado, C.; Arnaldo, S.; Martín, M.; et al. FVV Live: A Real-Time Free-Viewpoint Video System with Consumer Electronics Hardware. IEEE Trans. Multimed. 2022, 24, 2378–2391. [Google Scholar] [CrossRef]
- IRCCyN/IVC DIBR Video Quality Dataset. Available online: https://qualinet.github.io/databases/video/irccynivc_dibr_videos/ (accessed on 17 December 2013).
- Liu, X.; Zhang, Y.; Hu, S.; Kwong, S.; Kuo, C.-C.J.; Peng, Q. Subjective and Objective Video Quality Assessment of 3D Synthesized Views with Texture/Depth Compression Distortion. IEEE Trans. Image Process. 2015, 24, 4847–4861. [Google Scholar] [CrossRef] [PubMed]
- ITU-T Recommendation P.910: Subjective Video Quality Assessment Methods for Multimedia Applications; International Telecommunication Union: Geneva, Switzerland, 1999. [Google Scholar]
- VSRS-1D-Fast. Available online: https://hevc.hhi.fraunhofer.de/svn/svn_3DVCSoftware (accessed on 7 August 2015).
- Ling, S.; Gutiérrez, J.; Gu, K.; Le Callet, P. Prediction of the Influence of Navigation Scan-Path on Perceived Quality of Free-Viewpoint Videos. IEEE J. Emerg. Sel. Topics Circuits Syst. 2019, 9, 204–216. [Google Scholar] [CrossRef]
- Yan, J.; Li, J.; Fang, Y.; Che, Z.; Xia, X.; Liu, Y. Subjective and Objective Quality of Experience of Free Viewpoint Videos. IEEE Trans. Image Process. 2022, 31, 3896–3907. [Google Scholar] [CrossRef]
- Moorthy, A.K.; Bovik, A.C. A Two-Step Framework for Constructing Blind Image Quality Indices. IEEE Signal Process. Lett. 2010, 17, 513–516. [Google Scholar] [CrossRef]
- Liu, L.; Liu, B.; Huang, H.; Bovik, A.C. No-Reference Image Quality Assessment Based on Spatial and Spectral Entropies. Signal Process. Image Commun. 2014, 29, 856–863. [Google Scholar] [CrossRef]
- Mittal, A.; Soundararajan, R.; Bovik, A.C. Making a Completely Blind Image Quality Analyzer. IEEE Signal Process. Lett. 2013, 20, 209–212. [Google Scholar] [CrossRef]
- Mittal, A.; Saad, M.A.; Bovik, A.C. A Completely Blind Video Integrity Oracle. IEEE Trans. Image Process. 2016, 25, 289–300. [Google Scholar] [CrossRef]
- Saad, M.A.; Bovik, A.C.; Charrier, C. Blind Prediction of Natural Video Quality. IEEE Trans. Image Process. 2014, 23, 1352–1365. [Google Scholar] [CrossRef] [PubMed]
- Dendi, S.V.R.; Channappayya, S.S. No-Reference Video Quality Assessment Using Natural Spatiotemporal Scene Statistics. IEEE Trans. Image Process. 2020, 29, 5612–5624. [Google Scholar] [CrossRef]
- Battisti, F.; Le Callet, P. Quality Assessment in the Context of FTV: Challenges, First Answers and Open Issues. IEEE COMSOC MMTC Commun. Front. 2016, 11, 22–27. [Google Scholar]
- Gu, K.; Jakhetiya, V.; Qiao, J.-F.; Li, X.; Lin, W.; Thalmann, D. Model-Based Referenceless Quality Metric of 3D Synthesized Images Using Local Image Description. IEEE Trans. Image Process. 2018, 27, 394–405. [Google Scholar] [CrossRef]
- Gu, K.; Qiao, J.; Lee, S.; Liu, H.; Lin, W.; Le Callet, P. Multiscale Natural Scene Statistical Analysis for No-Reference Quality Evaluation of DIBR-Synthesized Views. IEEE Trans. Broadcast. 2020, 66, 127–139. [Google Scholar] [CrossRef]
- Tian, S.; Zhang, L.; Morin, L.; Déforges, O. NIQSV: A No-Reference Image Quality Assessment Metric for 3D Synthesized Views. In Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), New Orleans, LA, USA, 5–9 March 2017; pp. 1248–1252. [Google Scholar]
- Tian, S.; Zhang, L.; Morin, L.; Déforges, O. NIQSV+: A No-Reference Synthesized View Quality Assessment Metric. IEEE Trans. Image Process. 2018, 27, 1652–1664. [Google Scholar] [CrossRef]
- Wang, G.; Wang, Z.; Gu, K.; Li, L.; Xia, Z.; Wu, L. Blind Quality Metric of DIBR-Synthesized Images in the Discrete Wavelet Transform Domain. IEEE Trans. Image Process. 2020, 29, 1802–1814. [Google Scholar] [CrossRef] [PubMed]
- Fang, Z.; Cui, Y.; Yu, M.; Jiang, G.; Lian, K.; Wen, Y.; Xu, J. Blind 3D-Synthesized Image Quality Measurement by Analysis of Local and Global Statistical Properties. IEEE Trans. Instrum. Meas. 2023, 72, 1–15. [Google Scholar] [CrossRef]
- Sandić-Stanković, D.D.; Kukolj, D.D.; Le Callet, P. Fast Blind Quality Assessment of DIBR-Synthesized Video Based on High-High Wavelet Subband. IEEE Trans. Image Process. 2019, 28, 5524–5536. [Google Scholar] [CrossRef] [PubMed]
- Zhou, Y.; Li, L.; Wang, S.; Wu, J.; Zhang, Y. No-Reference Quality Assessment of DIBR-Synthesized Videos by Measuring Temporal Flickering. J. Vis. Commun. Image Represent. 2018, 55, 30–39. [Google Scholar] [CrossRef]
- Boissonade, P.; Jung, J. Proposition of New Sequences for Windowed-6DoF Experiments on Compression, Synthesis, and Depth Estimation; Document ISO/IEC JTC1/SC29/WG11 MPEG M43318; ISO: Geneva, Switzerland, 2018. [Google Scholar]
- Kim, J.; Yun, K.; Jeong, J.; Cheong, W.-S.; Lee, G.; Seo, J. Multiview Contents for Windowed 6DoF: 3 × 5 Moving Picture and 3 × 91 Still Images; Document ISO/IEC JTC1/SC29/WG11 MPEG M50749; ISO: Geneva, Switzerland, 2019. [Google Scholar]
- Bae, S.-J.; Park, S.; Kim, J.W.; Jang, H.; Kim, D.H. Camera Array Based Windowed 6-DoF Moving Picture Contents; Document ISO/IEC JTC1/SC29/WG11 MPEG M42542; ISO: Geneva, Switzerland, 2018. [Google Scholar]
- Doyen, D.; Boisson, G.; Gendrot, R. EE_DEPTH: New Version of the Pseudo-Rectified Technicolor Painter Content; Document ISO/IEC JTC1/SC29/WG11 MPEG M43366; ISO: Geneva, Switzerland, 2018. [Google Scholar]
- VVC VTM-14.0 Reference Platform. Available online: https://vcgit.hhi.fraunhofer.de/jvet/VVCSoftware_VTM/-/tree/VTM-14.0 (accessed on 26 August 2021).
- Senoh, T.; Tetsutani, N.; Yasuda, H.; Teratani, M. [MPEG-I Visual] Proposed View Synthesis Reference Software (pVSRS4.3) Manual; Document ISO/IEC JTC1/SC29/WG11 MPEG M44031.v5; ISO: Geneva, Switzerland, 2018. [Google Scholar]
- Senoh, T.; Tetsutani, N.; Yasuda, H. Enhanced VSRS to Four Reference Views; Document ISO/IEC JTC1/SC29/WG11 MPEG M42526; ISO: Geneva, Switzerland, 2018. [Google Scholar]
- Jung, J.; Boissonade, P.; Fournier, J.; Gicquel, J.C. [MPEG-I Visual] Proposition of Navigation Paths and Subjective Evaluation Method for Windowed 6DoF Experiments on Compression, Synthesis, and Depth Estimation; Document ISO/IEC JTC1/SC29/WG11 MPEG M42985; ISO: Geneva, Switzerland, 2018. [Google Scholar]
- Lowe, D.G. Distinctive Image Features from Scale-Invariant Keypoints. Int. J. Comput. Vis. 2004, 60, 91–110. [Google Scholar] [CrossRef]
- Friston, K. The Free-Energy Principle: A Unified Brain Theory? Nat. Rev. Neurosci. 2010, 11, 127–138. [Google Scholar] [CrossRef] [PubMed]
- Gu, K.; Zhai, G.; Lin, W.; Yang, X.; Zhang, W. Visual Saliency Detection with Free Energy Theory. IEEE Signal Process. Lett. 2015, 22, 1552–1555. [Google Scholar] [CrossRef]
- Wright, J.; Bourke, P. Markov Blankets and Mirror Symmetries—Free Energy Minimization and Mesocortical Anatomy. Entropy 2024, 26, 287. [Google Scholar] [CrossRef]
- Chen, Y.; Jiang, G.; Jin, C.; Luo, T.; Xu, H.; Yu, M. Multi-Attention Learning and Exposure Guidance Toward Ghost-Free High Dynamic Range Light Field Imaging. IEEE Trans. Vis. Comput. Graph. 2024, 30, 1–12. [Google Scholar] [CrossRef]
- Karacan, L.; Erdem, E.; Erdem, A. Structure-Preserving Image Smoothing via Region Covariances. ACM Trans. Graph. 2013, 32, 176:1–176:11. [Google Scholar] [CrossRef]
- Ham, B.; Cho, M.; Ponce, J. Robust Guided Image Filtering Using Nonconvex Potentials. IEEE Trans. Pattern Anal. Mach. Intell. 2018, 40, 192–207. [Google Scholar] [CrossRef] [PubMed]
- Mittal, A.; Moorthy, A.K.; Bovik, A.C. No-Reference Image Quality Assessment in the Spatial Domain. IEEE Trans. Image Process. 2012, 21, 4695–4708. [Google Scholar] [CrossRef]
- Zuo, Y.; Wang, H.; Fang, Y.; Huang, X.; Shang, X.; Wu, Q. MIG-Net: Multi-Scale Network Alternatively Guided by Intensity and Gradient Features for Depth Map Super-Resolution. IEEE Trans. Multimed. 2022, 24, 3506–3519. [Google Scholar] [CrossRef]
- Shnayderman, A.; Gusev, A.; Eskicioglu, A.M. An SVD-Based Grayscale Image Quality Measure for Local and Global Assessment. IEEE Trans. Image Process. 2006, 15, 422–429. [Google Scholar] [CrossRef]
- Breiman, L. Random Forests. Mach. Learn. 2001, 45, 5–32. [Google Scholar] [CrossRef]
Sequences | Camera Array | Camera Space (cm) | Resolution | QP Pairs | DIBR Schemes | Paths | Syn-Videos | FPS
---|---|---|---|---|---|---|---|---
 | 5 × 5 | 11.5 × 11.5 | 1920 × 1080 | No, (35,42), (40,45), (45,48) | S1, S2, S3, S4 | P1, P2, P3 | 48 | 25
 | 5 × 5 | 10 × 10 | 1920 × 1080 | No, (35,42), (40,45), (45,48) | S1, S2, S3, S4 | P1, P2, P3 | 48 | 25
 | 3 × 5 | 15 × 15 | 1920 × 1080 | No, (35,42), (40,45), (45,48) | S1, S2, S3, S4 | P1, P2, P3 | 48 | 25
 | 5 × 5 | 10 × 10 | 1920 × 1080 | No, (35,42), (40,45), (45,48) | S1, S2, S3, S4 | P1, P2, P3 | 48 | 25
 | 4 × 4 | 7 × 7 | 2048 × 1088 | No, (35,42), (40,45), (45,48) | S1, S2, S3, S4 | P1, P2, P3 | 48 | 25
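As a quick check on the database composition summarized above, the snippet below enumerates the distortion configurations per sequence; the 4 QP settings, 4 DIBR schemes, and 3 navigation paths listed in the table multiply to the 48 synthesized videos reported for each sequence. The labels are taken from the table; the assumption that the factors combine exhaustively is ours.

```python
from itertools import product

qp_pairs = ["No", "(35,42)", "(40,45)", "(45,48)"]   # texture/depth QP pairs ("No" = uncompressed)
dibr_schemes = ["S1", "S2", "S3", "S4"]
paths = ["P1", "P2", "P3"]

configs = list(product(qp_pairs, dibr_schemes, paths))
print(len(configs))        # 48 synthesized videos per source sequence
print(5 * len(configs))    # 240 synthesized videos over the five sequences
```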
Metric | Type | PLCC | SROCC | KROCC | RMSE |
---|---|---|---|---|---|
BIQI [18] | T-IQA NR | 0.2762 | 0.2421 | 0.1666 | 0.8197 |
SSEQ [19] | T-IQA NR | 0.3486 | 0.3237 | 0.2257 | 0.7980 |
NIQE [20] | T-IQA NR | 0.6302 | 0.6087 | 0.4366 | 0.6598 |
VIIDEO [21] | T-VQA NR | 0.3633 | 0.3389 | 0.2372 | 0.7972 |
VB-II [22] | T-VQA NR | 0.8681 | 0.8622 | 0.6762 | 0.4299 |
APT [25] | 1DoF-S IQA NR | 0.4328 | 0.4053 | 0.2859 | 0.7684 |
MNSS [26] | 1DoF-S IQA NR | 0.5325 | 0.5227 | 0.3709 | 0.7152 |
NIQSV [27] | 1DoF-S IQA NR | 0.2769 | 0.2459 | 0.1684 | 0.8205 |
NIQSV+ [28] | 1DoF-S IQA NR | 0.5251 | 0.5267 | 0.3716 | 0.7208 |
Wang et al. [29] | 1DoF-S IQA NR | 0.5005 | 0.4763 | 0.3344 | 0.7363 |
Jin et al. [10] | Win-6DoF-S VQA NR | 0.8308 | 0.8150 | 0.6364 | 0.4260 |
Proposed | Tra-6DoF-S VQA NR | 0.8923 | 0.8700 | 0.6967 | 0.3886 |
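For reference, the PLCC, SROCC, KROCC, and RMSE criteria reported in the table above are conventionally computed as sketched below, with a monotonic logistic mapping applied to the predicted scores before PLCC/RMSE. This is a generic evaluation sketch (the exact mapping function used by the authors is not specified here), not part of the proposed metric.

```python
import numpy as np
from scipy.stats import pearsonr, spearmanr, kendalltau
from scipy.optimize import curve_fit

def logistic4(x, b1, b2, b3, b4):
    # Four-parameter logistic mapping, a common choice before computing PLCC/RMSE.
    return (b1 - b2) / (1.0 + np.exp(-(x - b3) / np.abs(b4))) + b2

def evaluate(pred, mos):
    p0 = [mos.max(), mos.min(), pred.mean(), 1.0]
    params, _ = curve_fit(logistic4, pred, mos, p0=p0, maxfev=10000)
    fitted = logistic4(pred, *params)
    return {
        "PLCC": pearsonr(fitted, mos)[0],
        "SROCC": spearmanr(pred, mos)[0],
        "KROCC": kendalltau(pred, mos)[0],
        "RMSE": float(np.sqrt(np.mean((fitted - mos) ** 2))),
    }

# Toy example: noisy predictions against placeholder MOS values.
rng = np.random.default_rng(1)
mos = rng.uniform(1, 5, 48)
pred = mos + rng.normal(0, 0.4, 48)
print(evaluate(pred, mos))
```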
Scale Number | 2 | 3 | 4 | 5 | 6 |
---|---|---|---|---|---|
PLCC | 0.8853 | 0.8892 | 0.8904 | 0.8923 | 0.8936 |
SROCC | 0.8534 | 0.8630 | 0.8688 | 0.8700 | 0.8687 |
KROCC | 0.6709 | 0.6816 | 0.6942 | 0.6967 | 0.6963 |
RMSE | 0.4094 | 0.3977 | 0.3905 | 0.3886 | 0.3891 |
Neighborhood Pixel Range | 3 | 5 | 7 | 9
---|---|---|---|---
PLCC | 0.8923 | 0.8990 | 0.8946 | 0.8738
SROCC | 0.8700 | 0.8703 | 0.8721 | 0.8580
KROCC | 0.6967 | 0.7028 | 0.6979 | 0.6327
RMSE | 0.3886 | 0.3805 | 0.3855 | 0.3953
Time (min) | 2.9 | 5.2 | 6.6 | 7.1
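To make the swept parameter concrete, the snippet below computes local first- and second-order statistics over an n × n neighborhood for n = 3, 5, 7, 9, the range examined in the table above. Which statistics the proposed method actually aggregates over the neighborhood is not detailed in this summary, so `local_moments` is purely illustrative.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def local_moments(img, n=5):
    """Local mean and variance over an n-by-n neighborhood (n = 3, 5, 7, 9 above)."""
    mean = uniform_filter(img, size=n)
    var = uniform_filter(img ** 2, size=n) - mean ** 2
    return mean, np.clip(var, 0, None)   # clip tiny negative values from round-off

img = np.random.default_rng(2).random((64, 64))
for n in (3, 5, 7, 9):
    _, v = local_moments(img, n)
    print(n, float(v.mean()))            # larger neighborhoods smooth the local variance
```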
Auto-Correlation Threshold d | 20 | 40 | 60 | 80 | 100 | 120 | 140 |
---|---|---|---|---|---|---|---|
PLCC | 0.8841 | 0.9003 | 0.8905 | 0.9021 | 0.8923 | 0.8627 | 0.8411 |
SROCC | 0.8258 | 0.8528 | 0.8476 | 0.8683 | 0.8700 | 0.8549 | 0.8269 |
KROCC | 0.6266 | 0.6544 | 0.6557 | 0.6846 | 0.6967 | 0.6813 | 0.6373 |
RMSE | 0.4487 | 0.4094 | 0.4026 | 0.3933 | 0.3886 | 0.3878 | 0.4418 |
Regression Model | Training-Testing Ratio | PLCC | SROCC | KROCC | RMSE
---|---|---|---|---|---
SVR | 50-50% | 0.8293 | 0.8177 | 0.6322 | 0.4468
SVR | 60-40% | 0.8576 | 0.8381 | 0.6430 | 0.4243
SVR | 70-30% | 0.8750 | 0.8579 | 0.6665 | 0.4023
SVR | 80-20% | 0.8884 | 0.8695 | 0.6788 | 0.3961
SVR | 90-10% | 0.8980 | 0.8836 | 0.6898 | 0.3754
RF | 50-50% | 0.8398 | 0.8340 | 0.6399 | 0.4357
RF | 60-40% | 0.8585 | 0.8472 | 0.6545 | 0.4169
RF | 70-30% | 0.8759 | 0.8654 | 0.6771 | 0.3988
RF | 80-20% | 0.8923 | 0.8700 | 0.6967 | 0.3886
RF | 90-10% | 0.9180 | 0.8985 | 0.7398 | 0.3294
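The protocol behind the table above (random training-testing splits at several ratios, evaluated with SVR and random forest regressors) can be reproduced in outline as follows. The feature dimensionality, the number of repeated splits, and the synthetic MOS values are assumptions for illustration; only the split ratios and the two regressor types come from the table.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVR
from sklearn.ensemble import RandomForestRegressor
from scipy.stats import pearsonr

rng = np.random.default_rng(3)
X = rng.random((240, 15))                                  # placeholder feature vectors (240 videos assumed)
y = 3 + X[:, :5].sum(axis=1) + rng.normal(0, 0.2, 240)     # placeholder MOS values

regressors = [("SVR", SVR(kernel="rbf", C=10)),
              ("RF", RandomForestRegressor(n_estimators=200, random_state=0))]

for name, model in regressors:
    for test_size in (0.5, 0.4, 0.3, 0.2, 0.1):
        plccs = []
        for seed in range(20):                              # report the median over repeated random splits
            Xtr, Xte, ytr, yte = train_test_split(X, y, test_size=test_size, random_state=seed)
            pred = model.fit(Xtr, ytr).predict(Xte)
            plccs.append(pearsonr(pred, yte)[0])
        print(f"{name} train {100 * (1 - test_size):.0f}%: PLCC={np.median(plccs):.3f}")
```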
Model | Distortion Components (fcom, fren, fcra) | PLCC | SROCC | KROCC | RMSE
---|---|---|---|---|---
1 | √ | 0.8172 | 0.8106 | 0.6251 | 0.4743
2 | √ | 0.8563 | 0.8276 | 0.6366 | 0.4395
3 | √ | 0.6028 | 0.5696 | 0.4630 | 0.6821
4 | √ √ | 0.8737 | 0.8518 | 0.6697 | 0.3700
5 | √ √ | 0.8674 | 0.8270 | 0.6444 | 0.4529
6 | √ √ | 0.8600 | 0.8399 | 0.6708 | 0.4310
7 | √ √ √ | 0.8923 | 0.8700 | 0.6967 | 0.3886
Sequences | RMSE
---|---
 | 0.2513
 | 0.2929
 | 0.1570
 | 0.2934
 | 0.3305
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Jin, C.; Chen, Y. Towards Quality Assessment for Arbitrary Translational 6DoF Video: Subjective Quality Database and Objective Assessment Metric. Entropy 2025, 27, 44. https://doi.org/10.3390/e27010044