Perspective

A Perspective on Quality Evaluation for AI-Generated Videos

Zhichao Zhang, Wei Sun and Guangtao Zhai
Department of Electronic Engineering, Shanghai Jiao Tong University, Shanghai 200030, China
* Author to whom correspondence should be addressed.
Sensors 2025, 25(15), 4668; https://doi.org/10.3390/s25154668
Submission received: 17 April 2025 / Revised: 8 July 2025 / Accepted: 21 July 2025 / Published: 28 July 2025
(This article belongs to the Special Issue Perspectives in Intelligent Sensors and Sensing Systems)

Abstract

Recent breakthroughs in AI-generated content (AIGC) have transformed video creation, enabling systems to translate text, images, or audio into visually compelling stories. Yet reliable evaluation of these machine-crafted videos remains elusive, because quality is governed not only by spatial fidelity within individual frames but also by temporal coherence across frames and precise semantic alignment with the intended message. Sensor technologies play a foundational role here, as they determine the physical plausibility of AIGC outputs. In this perspective, we argue that multimodal large language models (MLLMs) are poised to become the cornerstone of next-generation video quality assessment (VQA). By jointly encoding cues from multiple modalities such as vision, language, sound, and even depth, an MLLM can leverage its powerful language understanding to assess the quality of scene composition, motion dynamics, and narrative consistency, overcoming both the fragmentation of hand-engineered metrics and the poor generalization of CNN-based methods. Furthermore, we provide a comprehensive analysis of current methodologies for assessing AIGC video quality, covering the evolution of generation models, dataset design, quality dimensions, and evaluation frameworks. Finally, we argue that advances in sensor fusion enable MLLMs to combine low-level physical constraints with high-level semantic interpretations, further improving the accuracy of visual quality assessment.
Keywords: video quality assessment; AI-generated video; MLLM
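
To make the envisioned pipeline concrete, the following is a minimal Python sketch of an MLLM-based VQA loop that scores an AI-generated clip along the three dimensions discussed above (spatial fidelity, temporal coherence, and text-video alignment). The query_mllm hook, the prompt wording, and the 1-5 scales are illustrative assumptions rather than the authors' method; only the OpenCV frame-sampling calls are standard library usage.

```python
# Minimal sketch of MLLM-based AIGC video quality assessment.
# Assumptions: query_mllm() is a hypothetical hook to wire to any multimodal
# LLM that accepts interleaved images; the prompt and 1-5 scales are illustrative.
import json
import cv2  # pip install opencv-python


def sample_frames(video_path: str, num_frames: int = 8):
    """Uniformly sample frames so the MLLM sees the whole clip, not one image."""
    cap = cv2.VideoCapture(video_path)
    total = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
    frames = []
    for i in range(num_frames):
        # Spread indices evenly from the first frame to the last.
        idx = i * max(total - 1, 1) // max(num_frames - 1, 1)
        cap.set(cv2.CAP_PROP_POS_FRAMES, idx)
        ok, frame = cap.read()
        if ok:
            frames.append(frame)
    cap.release()
    return frames


PROMPT = (
    "You are a video quality judge. Given {n} frames sampled from an "
    "AI-generated video for the prompt '{prompt}', rate each dimension "
    "from 1 (bad) to 5 (excellent) and reply as JSON: "
    '{{"spatial_fidelity": x, "temporal_coherence": x, "text_alignment": x}}'
)


def query_mllm(frames, prompt_text: str) -> str:
    """Hypothetical hook: replace with a call to your MLLM of choice
    (e.g., a GPT-4o- or LLaVA-style API that accepts images plus text)."""
    raise NotImplementedError("wire this to an actual multimodal LLM")


def assess(video_path: str, gen_prompt: str) -> dict:
    frames = sample_frames(video_path)
    reply = query_mllm(frames, PROMPT.format(n=len(frames), prompt=gen_prompt))
    return json.loads(reply)  # e.g., {"spatial_fidelity": 4, ...}
```

Sampling multiple frames (rather than scoring a single frame) is what lets the model reason about temporal coherence; the structured JSON reply keeps per-dimension scores machine-readable for benchmarking.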

