Article

Perception-Driven and Object-Aware Fast MTT Partitioning for H.266/VVC: A Saliency-Guided Complexity Reduction Framework

1 Department of Electrical Engineering, National Dong Hwa University, Hualien 974, Taiwan
2 Department of Electrical Engineering, National Taiwan Normal University, Taipei 106, Taiwan
3 Department of Electrical Engineering, National Sun Yat-sen University, Kaohsiung 804, Taiwan
* Authors to whom correspondence should be addressed.
Electronics 2026, 15(1), 133; https://doi.org/10.3390/electronics15010133
Submission received: 30 September 2025 / Revised: 14 December 2025 / Accepted: 20 December 2025 / Published: 27 December 2025

Abstract

The H.266/Versatile Video Coding (VVC) standard was developed to address the growing demand for compressing ultra-high-definition video content, supporting resolutions ranging from 4K to 8K and beyond. H.266/VVC improves coding efficiency by introducing a flexible quadtree with nested multi-type tree (QT-MTT) partitioning and various advanced coding tools. However, these improvements substantially increase the encoding complexity. To address this issue, we propose a perception-driven and object-aware algorithm that accelerates the MTT process in H.266/VVC intra coding. Our method integrates pixel-level saliency detection with object bounding box detection. Specifically, visually distinguishable (VD) pixels are identified using a just noticeable distortion (JND) model based on average background luminance, while detected-object regions are extracted using a YOLO object detection network. These two types of perceptual information are combined to guide adaptive encoding decisions. For each frame, a perception-driven pixel map labeled with VD pixels and a YOLO-based object map are generated. Within the MTT framework, partitioning decisions are determined jointly by standard deviation metrics derived from VD pixels and detected-object region coverage. By incorporating flexible threshold settings, the proposed method can meet different users’ requirements. In this paper, we performed experiments under three threshold settings. The experimental results demonstrate that the proposed method reduces H.266/VVC intra coding time by 27.94% to 43.11%, with BDBR increases of only 1.02% to 1.53%, thus achieving an appropriate trade-off between encoding speed and coding efficiency.

1. Introduction

With the rapid advancement of multimedia and communication technologies, high-definition (HD) video has become ubiquitous across diverse applications, including mobile streaming, video conferencing, HD television, and immersive experiences such as virtual reality and augmented reality. These scenarios increase the demand for video quality, resolution, and efficient compression, especially under constrained bandwidth and storage conditions. To meet these growing requirements, the Joint Video Experts Team (JVET), jointly established by ITU-T VCEG and ISO/IEC MPEG, finalized the H.266/Versatile Video Coding (VVC) standard in July 2020 [1,2,3,4].
H.266/VVC supports ultra-high-definition (UHD) content ranging from 4K to beyond 8K and achieves around 50% bitrate reduction compared to its predecessor, H.265/High Efficiency Video Coding (HEVC) [3,4]. H.266/VVC increases the maximum coding tree unit (CTU) size from 64 × 64 to 128 × 128, improving coding efficiency and adaptability to spatial content variation [5]. One of the major innovations in H.266/VVC is the inclusion of a more flexible block split scheme known as the quadtree with nested multi-type tree (QT-MTT) structure. In contrast to the quadtree partition used in H.265/HEVC, H.266/VVC further allows both binary tree (BT) and ternary tree (TT) splits for a coding unit (CU), applied either horizontally or vertically, as shown in Figure 1a. The CTU partitioning process recursively splits CUs using the QT-MTT partitioning scheme, which includes QT, BT, and TT, until a predefined minimum CU size is reached. The encoder applies rate-distortion optimization (RDO) [5,6,7] to evaluate all candidate partitioning modes and select the one that best balances bitrate and distortion. Figure 1b illustrates a partitioning example generated from an actual video frame. As shown in the figure, regions with higher texture complexity or significant content variation tend to undergo finer partitions, while more homogeneous areas require only coarse partitioning. While these enhancements significantly improve coding efficiency, they also introduce a large number of split candidates, resulting in a substantial increase in encoding complexity.
Various fast algorithms have been proposed to reduce encoding time, typically by relying on structural heuristics or texture-based features. However, these methods commonly overlook perceptual redundancy, leading to unnecessary evaluations in visually insignificant regions and undermining overall efficiency. Furthermore, many existing approaches are designed for a single operating condition, which limits their adaptability and makes it difficult for them to meet diverse user requirements. Learning-based approaches often rely on specific datasets or handcrafted features, limiting generalization across unseen content or quantization settings and causing inconsistent trade-offs between complexity and visual quality. In contrast, human visual perception is highly selective, focusing on salient regions. Based on this observation, our goal is to preserve the detailed quality of salient regions while saving computational resources in visually insignificant regions.
In this paper, we propose a saliency-guided complexity reduction framework for H.266/VVC intra coding that integrates pixel-level visual perception detection and object-level region awareness. For each frame, a perception-driven pixel map labeled with visually distinguishable (VD) pixels and a YOLO-based object map are generated. Within the MTT framework, partitioning decisions are determined jointly by standard deviation metrics of normalized VD pixel counts across CU sub-blocks and detected-object region coverage. Specifically, our method first identifies VD pixels that represent salient content and constructs a detected-object region map via YOLO object detection to locate salient regions. Combining the perception-driven pixel map with object detection enables the framework to capture both pixel-level saliency and high-level semantic importance. Based on the perceptual importance of each CU, fast decision strategies are proposed to reduce unnecessary computation. In particular, the framework selectively skips BT and TT splits based on empirically determined thresholds and the detected-object region distribution. This enables perceptually adaptive complexity reduction, preserving overall coding efficiency and visual quality in perceptually important areas.
The remainder of this paper is organized as follows. Section 2 describes the related work. Section 3 details the proposed method. Afterward, the experimental results are demonstrated in Section 4. Finally, Section 5 concludes the paper.

2. Related Work

To address the high computational complexity introduced by the recursive partitioning structure in H.266/VVC, a variety of fast CU partitioning strategies have been proposed [8,9]. Based on a review of representative studies, these methods can be broadly classified into four categories: probability-based methods [10,11], texture-based methods [12,13,14,15], learning-based methods [16,17,18,19,20,21,22], and visual perception-guided methods [23,24,25,26]. In the following, we review the core concepts and representative approaches of each category.
Probability-based methods aim to accelerate CU partitioning decisions by constructing statistical analysis to estimate the likelihood of specific split patterns. Duan et al. formulated CU partitioning as a binary classification problem and applied Naive Bayes theory based on RD cost statistics to predict whether further splitting is necessary [10]. Park and Kang proposed a Bayesian decision method to skip redundant TT split directions. By comparing the RD costs of BTH and BTV, the method identifies the less likely TT direction to reduce intra coding complexity [11].
Texture-based methods utilize content characteristics such as texture complexity and directional features to define heuristic rules that accelerate the partitioning process. Wu et al. proposed a texture-based CU partitioning algorithm that first determines whether to terminate division early based on brightness flatness. If division is necessary, the MTT division is then decided based on variance statistics [12]. Song et al. proposed a fast CU partitioning method that first computes horizontal-to-vertical gradient ratios to remove improbable splitting directions and then compares sub-block variances to decide between binary and ternary partition patterns [13]. Ni et al. proposed a texture analysis-based BT and TT partition approach, and a gradient-based intra mode decision technique to decrease redundant processing [14]. Liu et al. proposed a fast CU partitioning method that uses cross-block differences in gradient and content to skip unnecessary horizontal and vertical partitions [15].
Learning-based methods employ machine learning or deep learning models to automatically learn partitioning decision rules from large-scale data. He et al. adopted two random forest models to classify CUs based on their texture complexity, training a predictor for simple and complex regions and introducing a separate termination classifier for ambiguous cases [16]. Wu et al. employed a two-stage SVM-based strategy to skip unnecessary partition modes during CU encoding, first using a split or non-split classifier to decide whether to split further, then a horizontal or vertical split model to determine the split direction, thus enabling the encoder to bypass redundant evaluations [17]. Wang et al. proposed a hierarchical support vector machine (SVM)-based partitioning algorithm that uses a fuzzy SVM for early termination and a directed acyclic graph SVM to determine the partition type of the CU [18]. Belghith et al. proposed a Convolutional Neural Network (CNN)-based fast partitioning method for 32 × 32 CUs, where a CNN-TT model learns TT splitting tendencies across three levels. Using predicted probabilities, different thresholds are applied at each level to decide whether to perform horizontal or vertical TT splits [19]. Im and Chan proposed a pre-processing algorithm that incorporates the relationship between the quantization parameter (QP) and discrete cosine transform (DCT) coefficients, and they employed a concatenate-designed CNN to explore DCT features for CU split prediction [20]. Tissier et al. employed a CNN to extract spatial features by generating a vector of probabilities that describes the partition at each 4 × 4 edge. This vector is then utilized by a light gradient boosting machine (LGBM) to predict the most likely splits at each block [21]. Saldanha et al. developed a configurable method based on LGBM, in which five separate classifiers were trained for each split type. Operating points are determined by varying the threshold settings to flexibly balance time savings and coding efficiency [22].
Visual perception-based methods aim to simulate the human eye’s sensitivity to visual information by incorporating visual saliency and perceptual importance into CU partition decisions, thereby enhancing subjective quality and coding efficiency. Chen et al. proposed an efficient decision-making method for MTT partitioning based on random forest models, aimed at rapidly determining horizontal and vertical partitioning modes. The method utilizes the horizontal and vertical distributions of perceptual information as input features for the machine learning model [23]. Tsai et al. designed a fast algorithm by leveraging the variances of perceptual information across different MTT partitioning modes of a CU, combined with machine learning classifiers [24]. Cui and Liang adopted an improved pixel-domain just noticeable distortion (JND)-based perceptual model derived from previous research. They utilized perceptual distortion variance and sub-CU differences to assist early termination and split mode selection, thereby reducing unnecessary partitioning [25]. Li et al. employed a saliency map to determine the visual significance of each region and jointly controlled partition and quantization to preserve quality in perceptually sensitive areas [26].

3. Proposed Method

3.1. Visual Perception

Perceptual visibility in the human visual system has been studied in psychophysics. Early work based on Weber’s law formulated the JND as the minimal luminance increment perceptible relative to the background intensity, establishing the proportional relationship commonly referred to as the Weber fraction [27]. Subsequent studies further examined how this proportionality varies under different luminance conditions and image structures, which inspired luminance-adaptive thresholding strategies widely adopted in later JND models [28]. Complementary research, particularly Barten’s contrast sensitivity function (CSF), analyzed human sensitivity to spatial frequencies, demonstrating that contrast sensitivity depends on spatial frequency rather than remaining constant across frequency bands [29]. In parallel, a physiological response model, the Naka–Rushton equation, described the nonlinear saturation behavior of photoreceptors, reflecting how visual responses gradually approach saturation as stimulus intensity increases [30]. Together, these findings establish the theoretical foundation for perceptual thresholds.
JND-based approaches characterize pixel-level visibility thresholds by explicitly modeling luminance-dependent perceptual sensitivity. Several engineering-oriented JND models have further incorporated additional masking effects. For example, an advanced pixel-domain JND formulation was proposed to account for structural uncertainty and pattern masking, enabling more accurate suppression of texture-related redundancies [31]. However, such approaches require the computation of local structural information and luminance contrast, leading to substantially higher computational complexity compared with luminance-based models.
Beyond objective pixel-level distortions, perceptual sensitivity to luminance variations significantly affects how the human visual system perceives image quality. Specifically, changes in gray levels become noticeable only when they exceed a certain visibility threshold. Instead of relying solely on traditional metrics such as gradients or textures, detecting luminance changes in regions that are truly perceptually sensitive facilitates the identification of visually significant areas. The JND model serves as a quantitative approach to representing perceptual redundancies inherent in the human visual system. In this paper, for computational simplicity, we employ a luminance-based JND model that incorporates average background luminance to identify visually distinguishable (VD) pixels within a CU [28]. Let $I(i,j)$ denote the gray-level intensity of the current pixel. The background luminance $B(i,j)$ for the pixel located at $(i,j)$ is estimated using a $5 \times 5$ mean filter. Based on the model proposed in [28], the visibility threshold $JND(i,j)$ is computed as a function of $B(i,j)$, with the parameters $T_0 = 17$ and $\gamma = 3/128$, as defined in (1). A pixel is classified as visually distinguishable if the absolute difference between its luminance and the local background, $|I(i,j) - B(i,j)|$, is greater than or equal to the corresponding JND threshold [23,24]. This binary decision is represented by $VD(i,j)$, as shown in (2).
$$JND(i,j)=\begin{cases}T_0\left(1-\sqrt{\dfrac{B(i,j)}{127}}\right)+3, & B(i,j)\le 127\\[4pt] \gamma\,\bigl(B(i,j)-127\bigr)+3, & B(i,j)>127\end{cases}\tag{1}$$

$$VD(i,j)=\begin{cases}1, & \text{if } \bigl|I(i,j)-B(i,j)\bigr|\ge JND(i,j)\\ 0, & \text{otherwise}\end{cases}\tag{2}$$
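To make the model concrete, the following minimal sketch computes the JND threshold of (1) and the VD map of (2) for an 8-bit luma frame. It assumes NumPy and SciPy's `uniform_filter` as the 5 × 5 background mean filter, which is an implementation choice rather than part of the original method.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def vd_map(luma: np.ndarray, t0: float = 17.0, gamma: float = 3.0 / 128.0) -> np.ndarray:
    """Label visually distinguishable (VD) pixels following Eqs. (1)-(2)."""
    luma = luma.astype(np.float64)
    # Background luminance B(i, j): 5 x 5 mean around each pixel.
    bg = uniform_filter(luma, size=5, mode="nearest")
    # Luminance-adaptive visibility threshold JND(i, j) from Eq. (1).
    jnd = np.where(
        bg <= 127,
        t0 * (1.0 - np.sqrt(bg / 127.0)) + 3.0,
        gamma * (bg - 127.0) + 3.0,
    )
    # Eq. (2): a pixel is VD when its deviation from the background
    # luminance reaches the visibility threshold.
    return (np.abs(luma - bg) >= jnd).astype(np.uint8)
```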
We use standard deviation metrics derived from the distribution of VD pixels within a CU to support early skip decisions. The standard deviations of normalized VD pixel numbers among sub-CUs for binary and ternary tree splits are calculated in (3) and (4), respectively. In the equations, $VD_{subCU_k}$ represents the normalized number of VD pixels in the $k$-th sub-CU, obtained by dividing the original VD pixel count by the area of the corresponding sub-CU to account for differences in block size. The terms $\mu_{BTH}$, $\mu_{BTV}$, $\mu_{TTH}$, and $\mu_{TTV}$ denote the average normalized VD number across all sub-CUs for BTH, BTV, TTH, and TTV partitioning, respectively. These values indicate the spatial inconsistency of the VD pixel distribution across different partitioning structures and are used to decide whether further BT or TT splitting is necessary.
$$\sigma_{BTH/BTV}=\sqrt{\frac{1}{2}\sum_{k=1}^{2}\left(VD_{subCU_k}-\mu_{BTH/BTV}\right)^{2}}\tag{3}$$

$$\sigma_{TTH/TTV}=\sqrt{\frac{1}{3}\sum_{k=1}^{3}\left(VD_{subCU_k}-\mu_{TTH/TTV}\right)^{2}}\tag{4}$$
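As an illustration, the sketch below evaluates (3) and (4) on the VD map of a CU, reusing `np` from the previous sketch. It assumes that TT splits follow VVC's 1:2:1 sub-block layout; the normalization by sub-CU area matches the definition of $VD_{subCU_k}$ above.

```python
def split_sigma(vd: np.ndarray, split: str) -> float:
    """Std. dev. of normalized VD counts among sub-CUs, Eqs. (3)-(4)."""
    h, w = vd.shape
    if split == "BTH":    # two equal horizontal halves
        subs = [vd[: h // 2], vd[h // 2 :]]
    elif split == "BTV":  # two equal vertical halves
        subs = [vd[:, : w // 2], vd[:, w // 2 :]]
    elif split == "TTH":  # 1:2:1 horizontal ternary split
        subs = [vd[: h // 4], vd[h // 4 : 3 * h // 4], vd[3 * h // 4 :]]
    elif split == "TTV":  # 1:2:1 vertical ternary split
        subs = [vd[:, : w // 4], vd[:, w // 4 : 3 * w // 4], vd[:, 3 * w // 4 :]]
    else:
        raise ValueError(f"unknown split mode: {split}")
    # Normalize each VD count by its sub-CU area to offset size imbalance.
    ratios = np.array([s.sum() / s.size for s in subs])
    # Population standard deviation, as in Eqs. (3) and (4).
    return float(np.sqrt(np.mean((ratios - ratios.mean()) ** 2)))
```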

3.2. Object Detection

YOLO (You Only Look Once) is a single-stage object detection framework that predicts bounding boxes and class probabilities jointly. Its object awareness makes it suitable for identifying salient regions in videos. Recent research has leveraged YOLO-based object detection together with H.266/VVC encoders to provide object-aware saliency cues, enabling the encoder to allocate more bits to regions containing meaningful objects [32]. Similarly, YOLO-based object detection has been used to guide QP selection and bit allocation in an H.266/VVC encoder, showing that object-aware information can be incorporated into bitrate control strategies to improve coding efficiency [33]. In this study, YOLOv7 is employed to extract visually salient regions that correspond to areas likely to draw human visual attention due to innate physiological characteristics of the human visual system [34]. These perceptually important regions are incorporated into the partitioning decision process to guide encoding in areas where visual quality is more likely to be noticed.
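The paper treats YOLOv7 detections as rectangular salient regions; the exact rasterization into a per-frame object map is not specified, so the following is a hedged sketch. The `detections` format (corner coordinates plus confidence) and the `min_conf` cutoff are illustrative assumptions, not YOLOv7 API details.

```python
def object_map(detections, height: int, width: int, min_conf: float = 0.5) -> np.ndarray:
    """Rasterize bounding boxes into a binary YOLO-based object map.

    detections: iterable of (x1, y1, x2, y2, confidence) tuples in pixel
    coordinates, as obtained after typical YOLO post-processing.
    """
    obj = np.zeros((height, width), dtype=np.uint8)
    for x1, y1, x2, y2, conf in detections:
        if conf >= min_conf:
            obj[int(y1) : int(y2), int(x1) : int(x2)] = 1  # box interior = salient
    return obj

def coverage(obj: np.ndarray, x: int, y: int, w: int, h: int) -> float:
    """Fraction of the CU at (x, y) with size w x h covered by detected objects."""
    return float(obj[y : y + h, x : x + w].mean())
```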

3.3. Overall Algorithm

The proposed algorithm accelerates H.266/VVC intra coding by integrating perceptual saliency analysis with object-level awareness. BT/TT splits are selectively skipped based on the standard deviations derived from VD pixels and the detected-object region coverage within each CU. Figure 2 summarizes the global workflow of the proposed method, illustrating the procedure from JND computation and object detection to the final split decision.
Our approach begins by generating a perception-driven pixel map labeled with VD pixels and a YOLO-based object map for each frame. Partitioning decisions within the MTT framework are jointly guided by standard deviation metrics of normalized VD pixel counts across CU sub-blocks and detected-object region coverage, as shown in Figure 3. Specifically, when a CU's overlap with the detected-object region reaches the coverage threshold P, indicating high perceptual importance, all BT splits are evaluated without skipping, regardless of the standard deviation. For TT splits, the standard deviation thresholds are adaptively selected based on detected-object region coverage: stricter thresholds are applied in salient regions to preserve finer partitions, while more relaxed thresholds are used elsewhere to enable early skipping. This adaptive strategy balances visual quality and computational complexity by preserving detail in perceptually significant areas and reducing processing in less important regions. A sketch of this decision logic follows.
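The sketch below is one possible reading of Figure 3, combining `split_sigma` and `coverage` from the earlier snippets. A split is kept for RDO evaluation when its VD standard deviation reaches the corresponding threshold; how the upper (U) and lower (L) TT thresholds of Table 2 map onto salient and non-salient CUs is our assumption here, chosen so that salient regions are harder to skip.

```python
def allowed_splits(cu_vd: np.ndarray, obj_cov: float, thr: dict) -> set:
    """Hypothetical BT/TT skip decision for one CU (cf. Figure 3).

    cu_vd:   binary VD map restricted to the current CU.
    obj_cov: fraction of the CU covered by detected-object regions.
    thr:     thresholds P, ThBT, TTHU, TTHL, TTVU, TTVL (Table 2).
    """
    keep = set()
    salient = obj_cov >= thr["P"]
    for mode in ("BTH", "BTV"):
        # Salient CUs evaluate all BT splits; elsewhere a BT split is kept
        # only if the VD distribution across its sub-CUs is inhomogeneous.
        if salient or split_sigma(cu_vd, mode) >= thr["ThBT"]:
            keep.add(mode)
    # Assumed mapping: salient CUs use the lower bound (harder to skip),
    # non-salient CUs the upper bound (easier to skip).
    if split_sigma(cu_vd, "TTH") >= (thr["TTHL"] if salient else thr["TTHU"]):
        keep.add("TTH")
    if split_sigma(cu_vd, "TTV") >= (thr["TTVL"] if salient else thr["TTVU"]):
        keep.add("TTV")
    return keep
```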

4. Experimental Results

In this study, we utilized the H.266/VVC Test Model (VTM) version 22.0 [35] to evaluate performance under the All Intra coding configuration. The test set includes sequences from Class A1, Class A2, Class B, Class C, Class D, and Class E, encompassing a broad range of content. For each sequence, we performed the experiments using the first 100 frames to maintain consistency. Our experiments follow the Common Test Conditions (CTC) [36], and detailed information about the test sequences is provided in Table 1. All experiments were conducted on a system running Windows 11 64-bit, equipped with an Intel Core i5-14400K CPU at 2.50 GHz and 32 GB of DDR4 RAM at 4800 MHz. In addition, we used the YOLOv7 model [34] for object detection, which was trained on the Microsoft Common Objects in Context (MS COCO) dataset [37].
The proposed decision-threshold mechanism offers flexible configurations by enabling multiple operating points to accommodate diverse application requirements. In this work, the threshold parameters include the object-region coverage threshold P and the standard deviation thresholds ThBT, TTHU, TTVU, TTHL, and TTVL. These thresholds were preconfigured according to the selected operating points before the encoding process began. By adjusting different threshold combinations, the encoder can achieve an appropriate trade-off between reducing encoding time and preserving coding efficiency. In practical scenarios, thresholds can be flexibly tuned based on hardware performance constraints, quality requirements, and real-time demands to determine the most suitable encoding strategy. In this paper, we conducted experiments under three parameter settings as shown in Table 2.
To assess and compare performance with previous works [12,17,25], we adopted a set of commonly used metrics, including Bjøntegaard Delta Bitrate (BDBR) [38], Time Saving (TS) as shown in (5), and the cost performance metric TS/BDBR [39].
$$TS=\frac{1}{4}\sum_{QP\in\{22,27,32,37\}}\frac{T_{VTM22.0}(QP)-T_{proposed}(QP)}{T_{VTM22.0}(QP)}\times 100\%\tag{5}$$
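For instance, a minimal helper for (5), assuming per-QP encoding times have been measured in seconds:

```python
def time_saving(t_vtm: dict, t_proposed: dict) -> float:
    """Average relative encoding-time reduction over the four CTC QPs, Eq. (5)."""
    qps = (22, 27, 32, 37)
    return 100.0 * sum((t_vtm[q] - t_proposed[q]) / t_vtm[q] for q in qps) / len(qps)
```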
The BDBR and TS performance of the proposed algorithm at different operating points is summarized in Table 3. The results demonstrate that the method reduces encoding time while maintaining coding efficiency, with the largest gains observed for high-resolution sequences. S1 employs conservative thresholds to limit the bitrate increase, resulting in reduced time savings. S2 adopts a threshold configuration between S1 and S3, whereas S3 utilizes a lower coverage threshold and looser skipping thresholds, reducing processing time at the cost of increased bitrate. These operating points provide a reference for understanding how different parameter selections affect performance. Table 3 also reports the standard deviations of BDBR and TS across all test sequences. The variations in BDBR remain small, with standard deviations of 0.67%, 0.66%, and 0.64% at operating points S1 to S3, respectively, across sequences of different resolutions and texture characteristics. In contrast, TS exhibits larger variability across content classes, mainly because the amount of time saving depends on scene characteristics. Sequences containing numerous CUs with homogeneous textures, such as those in Class A1 and Class A2, allow a greater number of BT and TT split operations to be skipped and thus lead to substantial time savings. However, the higher skip ratio may also result in a larger BDBR. In contrast, sequences with dense textures or rapidly changing spatial characteristics, such as those in Class C and Class D, inherently require more partitions and therefore yield smaller time savings. More importantly, the algorithm provides flexibility by allowing different threshold combinations at multiple operating points, thereby meeting diverse performance requirements for specific application scenarios.
A comparison between the proposed algorithm at operating point S2 and the methods in [12,17,25] is presented in Table 4. On average, the proposed method achieves a BDBR of 1.24% and a TS of 35.26%, both outperforming [12,25]. Although the TS in [17] is higher than that of the proposed algorithm, our method achieves noticeably better coding efficiency: the average BDBR of 1.24% is 1.47 percentage points lower than the 2.71% reported in [17]. For most sequences, our algorithm yields lower BDBR values than [17]. In addition, the proposed method attains a higher average TS/BDBR ratio of 33.76, compared to 26.22 in [17], indicating that a better trade-off between TS and BDBR can be achieved using the proposed method. Similar results can be observed by testing several 4K sequences from the Ultra Video Group (UVG) dataset [40] in Table 5.
Figure 4 provides a comprehensive comparison between the proposed algorithm and the related works [12,25], illustrating the relationship between TS and BDBR for the related methods as well as the three operating points (S1 to S3) previously defined for our approach. The curve in the figure shows extrapolated results under alternative operating points with varying threshold values. A trend can be observed across the three operating points. As the thresholds are adjusted from S1 to S3, the time saving increases, while the corresponding BDBR rises. This indicates that the proposed threshold mechanism enables the trade-off between complexity and bitrate. Furthermore, the relative positions of the points show that the proposed method achieves superior overall performance compared with the related works. Among the three proposed operating points, S2, whose BDBR is the closest to that of [25], achieves higher time saving. In addition, the approach in [12] shows both higher bitrate overhead and less time saving. The results indicate that the proposed algorithm outperforms the approaches in [12,25] by offering a better trade-off between TS and BDBR and also demonstrating adaptability to different application requirements.
Figure 5 presents the comparison of the proposed method at operating point S2 with [25] in terms of average BDBR and average TS across different sequence classes. Overall, the results show that the proposed method at operating point S2 achieves lower average BDBR than [25] for Class A2, Class B, Class C, and Class E. In addition, it also achieves higher TS values for all classes except Class E.
Figure 6 illustrates the rate-distortion (RD) curve comparison between the proposed algorithm at operating point S2 and VTM 22.0 for the CatRobot1 sequence in Class A2. As shown, the RD curve of operating point S2 closely follows that of VTM 22.0, indicating highly comparable RD performance. Furthermore, our method achieves a TS of 46.14%.
Figure 7 illustrates the outcomes of the proposed algorithm at operating point S2 using a frame from the BasketballPass sequence. Figure 7b presents the corresponding perception-driven pixel map. Figure 7c displays the object detection results obtained using YOLOv7, and Figure 7d shows the YOLO-based object map. Figure 7e,f compare the final partitioning results generated by VTM 22.0 and the proposed method at operating point S2, respectively. It can be observed that our algorithm allocates fine partitions to perceptually important regions (e.g., the players) while simplifying the structure in less relevant areas. The combination of VD pixel and object detection cues identifies perceptually important regions and facilitates the partitioning process.
Figure 8 presents a subjective quality comparison between the original VTM 22.0 coding results and the proposed algorithm at operating point S2 for a frame of the CatRobot1 sequence. As shown, the difference is hardly perceptible to the human eye. Moreover, the Peak Signal-to-Noise Ratio (PSNR) and Multi-Scale Structural Similarity (MS-SSIM) [41] values are very close, demonstrating that the proposed algorithm, which integrates VD pixel analysis with object-level detection, preserves high subjective visual quality while reducing encoding time.
Table 6 reports the computational overhead introduced by generating the perception-driven pixel map and the YOLO-based object map for each test sequence. The results show that the luminance-based JND computation incurs 0.0233–0.4006% overhead across all sequences, with an overall average of 0.0829% relative to the total H.266/VVC intra encoding time. Meanwhile, YOLO detection introduces an overhead ranging from 0.1812% to 5.7824%, with an overall average of 1.6054%. This overhead is higher because object detection is performed once per frame and involves neural network inference. Nevertheless, the cost remains substantially lower than the complexity of evaluating MTT partition candidates. Considering that the proposed method achieves 27.94–43.11% time saving at the three operating points, the time saved by reducing unnecessary BT/TT evaluations far outweighs the overhead of generating the perception-driven pixel map and the YOLO-based object map. In addition, the decision process relies solely on computationally efficient and regular statistical operations, including VD pixel-based metrics and object-coverage evaluation, indicating the potential of the approach for hardware integration in real-time encoder architectures.
The proposed method has limitations. Because the luminance-based JND model reflects only brightness variations, it cannot distinguish pixels or sub-blocks that share similar luminance but differ in chrominance. As a result, VD pixel counts may underrepresent regional complexity, leading to overly aggressive BT/TT skipping. In addition, the YOLO-based object detection is constrained in certain scenes, particularly cluttered ones. Large rectangular bounding boxes can include background that is perceptually insignificant, and objects outside the categories defined in the COCO dataset may not be detected and are treated as background. Future work may address these issues to further improve robustness.

5. Conclusions

In this paper, we address the high complexity of H.266/VVC intra coding by proposing a perception-driven and object-aware block partitioning strategy that integrates pixel-level visual saliency analysis with object detection. In perceptually less significant regions, the partitioning process is simplified to save computation time, whereas in regions containing important objects or fine details, the necessary partition process is preserved to maintain visual quality and coding efficiency. The proposed method incorporates adjustable thresholds, allowing flexible configurations across different operating points. Experimental results demonstrate that, at three operating points, the method achieves time savings ranging from 27.94% to 43.11% with BDBR increases limited to only 1.02% to 1.53%, thereby achieving a practical trade-off between encoding speed and compression efficiency. Notably, the proposed approach achieves greater acceleration on high-resolution sequences, thereby supporting the requirements of 4K/8K real-world video applications. These findings indicate that the integration of visual perception analysis with object detection can contribute to the development of intelligent video coding technologies. In future work, we plan to explore adaptive threshold strategies that dynamically adjust according to local content characteristics to improve robustness and overall coding performance.

Author Contributions

Conceptualization, C.-Y.L. and M.-J.C.; Data curation, C.-Y.L.; Methodology, C.-Y.L. and M.-J.C.; Software, C.-Y.L. and Y.-F.L.; Validation, J.-Y.Y., Y.-C.C. and C.-M.L.; Supervision, M.-J.C. and C.-H.Y.; Writing—original draft, J.-Y.Y., Y.-C.C. and M.-J.C.; Writing—review and editing, C.-Y.L., Y.-F.L., C.-M.L. and C.-H.Y. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the National Science and Technology Council, Taiwan, under Grant NSTC 112-2221-E-259-009-MY3.

Data Availability Statement

Data is contained within the article.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Uhrina, M.; Sevcik, L.; Bienik, J.; Smatanova, L. Performance comparison of VVC, AV1, HEVC, and AVC for high resolutions. Electronics 2024, 13, 953.
  2. Mercat, A.; Mäkinen, A.; Sainio, J.; Lemmetti, A.; Viitanen, M.; Vanne, J. Comparative rate-distortion-complexity analysis of VVC and HEVC video codecs. IEEE Access 2021, 9, 67813–67828.
  3. Bross, B.; Chen, J.; Ohm, J.-R.; Sullivan, G.J.; Wang, Y.-K. Developments in international video coding standardization after AVC, with an overview of versatile video coding (VVC). Proc. IEEE 2021, 109, 1463–1493.
  4. Bross, B.; Wang, Y.-K.; Ye, Y.; Liu, S.; Chen, J.; Sullivan, G.J.; Ohm, J.-R. Overview of the versatile video coding (VVC) standard and its applications. IEEE Trans. Circuits Syst. Video Technol. 2021, 31, 3736–3764.
  5. Huang, Y.-W.; An, J.; Huang, H.; Li, X.; Hsiang, S.-T.; Zhang, K.; Gao, H.; Ma, J.; Chubach, O. Block partitioning structure in the VVC standard. IEEE Trans. Circuits Syst. Video Technol. 2021, 31, 3818–3833.
  6. Ortega, A.; Ramchandran, K. Rate-distortion methods for image and video compression. IEEE Signal Process. Mag. 1998, 15, 23–50.
  7. Cerveira, A.; Agostini, L.; Zatt, B.; Sampaio, F. Memory profiling of H.266 versatile video coding standard. In Proceedings of the 2020 27th IEEE International Conference on Electronics, Circuits and Systems (ICECS), Glasgow, UK, 23–25 November 2020.
  8. Choi, K. A study on fast and low-complexity algorithms for versatile video coding. Sensors 2022, 22, 8990.
  9. Solovyev, T.; Sauer, J.; Pardo, J.E.F.; Alshina, E. AhG10: On Constrained Encoder Configuration of VTM; Doc. JVET-AL0055-v3; JVET: Geneva, Switzerland, 2025.
  10. Duan, L.; Jiang, X.; Li, W.; Jin, J.; Song, T.; Yu, F.R. VVC coding unit partitioning decision based on Naive Bayes theory. In Proceedings of the 2023 5th International Conference on Image Processing and Machine Vision, Macau, China, 13–15 January 2023; pp. 62–65.
  11. Park, S.-H.; Kang, J.-W. Context-based ternary tree decision method in versatile video coding for fast intra coding. IEEE Access 2019, 7, 172597–172605.
  12. Wu, Z.; Jiang, X.; Song, T.; Liu, J.; Cen, Q. CU partitioning algorithm based on texture complexity in VVC. In Proceedings of the 2024 6th International Conference on Video, Signal and Image Processing, Ningbo, China, 22–24 November 2024; pp. 100–104.
  13. Song, Y.; Cheng, S.; Wang, M.; Peng, X. Fast CU partition for VVC intra-frame coding via texture complexity. IEEE Signal Process. Lett. 2024, 31, 959–963.
  14. Ni, C.-T.; Lin, S.-H.; Chen, P.-Y.; Chu, Y.-T. High efficiency intra CU partition and mode decision method for VVC. IEEE Access 2022, 10, 77759–77771.
  15. Liu, H.; Zhu, S.; Xiong, R.; Liu, G.; Zeng, B. Cross-block difference guided fast CU partition for VVC intra coding. In Proceedings of the 2021 International Conference on Visual Communications and Image Processing, Munich, Germany, 5–8 December 2021.
  16. He, Q.; Wu, W.; Luo, L.; Zhu, C.; Guo, H. Random forest based fast CU partition for VVC intra coding. In Proceedings of the 2021 IEEE International Symposium on Broadband Multimedia Systems and Broadcasting, Chengdu, China, 4–6 August 2021.
  17. Wu, G.; Huang, Y.; Zhu, C.; Song, L.; Zhang, W. SVM-based fast CU partitioning algorithm for VVC intra coding. In Proceedings of the IEEE International Symposium on Circuits and Systems (ISCAS), Daegu, Republic of Korea, 22–28 May 2021.
  18. Wang, F.; Wang, Z.; Zhang, Q. FSVM- and DAG-SVM-based fast CU-partitioning algorithm for VVC intra-coding. Symmetry 2023, 15, 1078.
  19. Belghith, F.; Abdallah, B.; Ben Jdidia, S.; Ben Ayed, M.A.; Masmoudi, N. CNN-based ternary tree partition approach for VVC intra-QTMT coding. Signal Image Video Process. 2024, 18, 3587–3594.
  20. Im, S.-K.; Chan, K.-H. Faster intra-prediction of versatile video coding using a concatenate-designed CNN via DCT coefficients. Electronics 2024, 13, 2214.
  21. Tissier, A.; Hamidouche, W.; Mdalsi, S.B.D.; Vanne, J.; Galpin, F.; Menard, D. Machine learning based efficient QT-MTT partitioning scheme for VVC intra encoders. IEEE Trans. Circuits Syst. Video Technol. 2023, 33, 4279–4293.
  22. Saldanha, M.; Sanchez, G.; Marcon, C.; Agostini, L. Configurable fast block partitioning for VVC intra coding using light gradient boosting machine. IEEE Trans. Circuits Syst. Video Technol. 2022, 32, 3947–3960.
  23. Chen, M.-J.; Lee, C.-A.; Tsai, Y.-H.; Yang, C.-M.; Yeh, C.-H.; Kau, L.-J.; Chang, C.-Y. Efficient partition decision based on visual perception and machine learning for H.266/versatile video coding. IEEE Access 2022, 10, 42141–42150.
  24. Tsai, Y.-H.; Lu, C.-R.; Chen, M.-J.; Hsieh, M.-C.; Yang, C.-M.; Yeh, C.-H. Visual perception based intra coding algorithm for H.266/VVC. Electronics 2023, 12, 2079.
  25. Cui, X.-Y.; Liang, F. Perceptual based fast CU partition algorithm for VVC intra coding. In Proceedings of the 2023 IEEE Region 10 Conference (TENCON), Chiang Mai, Thailand, 31 October–3 November 2023.
  26. Li, W.; Jiang, X.; Jin, J.; Song, T.; Yu, F.R. Saliency-enabled coding unit partitioning and quantization control for versatile video coding. Information 2022, 13, 394.
  27. Norwich, K.H. On the theory of Weber fractions. Percept. Psychophys. 1987, 42, 286–298.
  28. Chou, C.-H.; Li, Y.-C. A perceptually tuned subband image coder based on the measure of just-noticeable-distortion profile. IEEE Trans. Circuits Syst. Video Technol. 1995, 5, 467–476.
  29. Barten, P.G.J. Evaluation of subjective image quality with the square-root integral method. J. Opt. Soc. Am. A 1990, 7, 2024–2031.
  30. Hisamitsu, S.; Okuno, H. An image coding algorithm with color constancy using the Retinex theory and the Naka–Rushton equation. In Proceedings of the 2022 International Conference on Artificial Life and Robotics, Beppu, Japan, 20–23 January 2022; pp. 507–512.
  31. Wu, J.; Lin, W.; Shi, G.; Wang, X.; Li, F. Pattern masking estimation in image with structural uncertainty. IEEE Trans. Image Process. 2013, 22, 4892–4904.
  32. Fischer, K.; Fleckenstein, F.; Herglotz, C.; Kaup, A. Saliency-driven versatile video coding for neural object detection. In Proceedings of the ICASSP 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Toronto, ON, Canada, 6–11 June 2021; pp. 1505–1509.
  33. Goto, K.; Katayama, T.; Song, T.; Shimamoto, T. YOLO-based bitrate control algorithm for VVC. In Proceedings of the 2023 International Technical Conference on Circuits/Systems, Computers, and Communications (ITC-CSCC), Jeju, Republic of Korea, 25–28 June 2023; pp. 227–231.
  34. Wang, C.-Y.; Bochkovskiy, A.; Liao, H.-Y.M. YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors. In Proceedings of the 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Vancouver, BC, Canada, 17–24 June 2023; pp. 7464–7475.
  35. JVET. VVC Test Model (VTM) Software, Version 22.0. Available online: https://vcgit.hhi.fraunhofer.de/jvet/VVCSoftware_VTM/-/tree/VTM-22.0?ref_type=tags (accessed on 4 September 2023).
  36. Bossen, F.; Li, X.; Sharman, K.; Seregin, V.; Suehring, K. VTM and HM Common Test Conditions and Software Reference Configurations for SDR 4:2:0 10-bit Video; Doc. JVET-AK2010-v1; JVET: Geneva, Switzerland, January 2025.
  37. Lin, T.-Y.; Maire, M.; Belongie, S.; Hays, J.; Perona, P.; Ramanan, D.; Dollar, P.; Zitnick, C.L. Microsoft COCO: Common objects in context. In Proceedings of the 13th European Conference on Computer Vision (ECCV), Zurich, Switzerland, 6–12 September 2014; pp. 740–755.
  38. Bjontegaard, G. Calculation of Average PSNR Differences Between RD-Curves; Doc. VCEG-M33; ITU-T Telecommunication Standardization Sector: Geneva, Switzerland, April 2001.
  39. Kau, L.-J.; Leng, J.-W. A gradient intensity-adapted algorithm with adaptive selection strategy for the fast decision of H.264/AVC intra-prediction modes. IEEE Trans. Circuits Syst. Video Technol. 2015, 25, 944–957.
  40. Mercat, A.; Viitanen, M.; Vanne, J. UVG dataset: 50/120fps 4K sequences for video codec analysis and development. In Proceedings of the ACM Multimedia Systems Conference (MMSys), Istanbul, Turkey, 8–11 June 2020; pp. 297–302.
  41. Wang, Z.; Simoncelli, E.P.; Bovik, A.C. Multiscale structural similarity for image quality assessment. In Proceedings of the Thirty-Seventh Asilomar Conference on Signals, Systems & Computers, Pacific Grove, CA, USA, 9–12 November 2003; pp. 1398–1402.
Figure 1. Illustration of H.266/VVC QT-MTT partitioning structure.
Figure 2. Overall workflow of the proposed method.
Figure 3. Flowchart of the proposed BT and TT skipping algorithm.
Figure 4. Comparison of BDBR and TS performance between the proposed algorithm and the methods in [12,25].
Figure 5. Comparison of BDBR and TS performance across sequence classes between the proposed algorithm at operating point S2 and the method in [25].
Figure 6. Comparison of RD curves of VTM 22.0 and the proposed method for the Class A2 CatRobot1 sequence.
Figure 7. Results of the proposed method and partition comparison with VTM 22.0 for a frame of the Class D BasketballPass sequence.
Figure 8. Subjective quality comparison for a frame of the CatRobot1 (Class A2) sequence. (a) VTM 22.0, QP 22, PSNR 42.8277 dB, MS-SSIM 0.995698. (b) Proposed, QP 22, PSNR 42.7906 dB, MS-SSIM 0.995321.
Table 1. Description of the test sequences.

| Class | Resolution | Sequence | Frame Rate (fps) | Bit Depth |
|-------|-------------|-----------------|------------------|-----------|
| A1 | 3840 × 2160 | Tango2 | 60 | 10 |
| A1 | 3840 × 2160 | FoodMarket4 | 60 | 10 |
| A1 | 3840 × 2160 | Campfire | 30 | 10 |
| A2 | 3840 × 2160 | CatRobot1 | 60 | 10 |
| A2 | 3840 × 2160 | DaylightRoad2 | 60 | 10 |
| A2 | 3840 × 2160 | ParkRunning3 | 50 | 10 |
| B | 1920 × 1080 | MarketPlace | 60 | 10 |
| B | 1920 × 1080 | RitualDance | 60 | 10 |
| B | 1920 × 1080 | Cactus | 50 | 8 |
| B | 1920 × 1080 | BasketballDrive | 50 | 8 |
| B | 1920 × 1080 | BQTerrace | 60 | 8 |
| C | 832 × 480 | RaceHorses | 30 | 8 |
| C | 832 × 480 | BQMall | 60 | 8 |
| C | 832 × 480 | PartyScene | 50 | 8 |
| C | 832 × 480 | BasketballDrill | 50 | 8 |
| D | 416 × 240 | RaceHorses | 30 | 8 |
| D | 416 × 240 | BQSquare | 60 | 8 |
| D | 416 × 240 | BlowingBubbles | 50 | 8 |
| D | 416 × 240 | BasketballPass | 50 | 8 |
| E | 1280 × 720 | FourPeople | 60 | 8 |
| E | 1280 × 720 | Johnny | 60 | 8 |
| E | 1280 × 720 | KristenAndSara | 60 | 8 |
Table 2. Parameter settings for each operating point.

| Operating Point | P | ThBT | TTHU | TTHL | TTVU | TTVL |
|-----------------|-----|--------|------|------|------|------|
| S1 | 0.7 | 0.0002 | 0.07 | 0.05 | 0.03 | 0.01 |
| S2 | 0.6 | 0.0004 | 0.12 | 0.10 | 0.05 | 0.03 |
| S3 | 0.4 | 0.0008 | 0.16 | 0.14 | 0.12 | 0.10 |
Table 3. BDBR and TS performance of the proposed algorithm at different operating points.

| Class | Sequence | S1 BDBR (%) | S1 TS (%) | S2 BDBR (%) | S2 TS (%) | S3 BDBR (%) | S3 TS (%) |
|-------|----------|-------------|-----------|-------------|-----------|-------------|-----------|
| A1 | Tango2 | 2.53 | 58.13 | 2.59 | 60.33 | 2.65 | 60.82 |
| A1 | FoodMarket4 | 1.96 | 42.80 | 2.06 | 43.30 | 2.07 | 44.62 |
| A1 | Campfire | 1.05 | 42.03 | 1.21 | 49.11 | 1.41 | 52.52 |
| A2 | CatRobot1 | 1.94 | 39.94 | 2.14 | 46.14 | 2.35 | 52.25 |
| A2 | DaylightRoad2 | 1.43 | 47.03 | 1.56 | 51.24 | 1.63 | 55.89 |
| A2 | ParkRunning3 | 0.51 | 38.51 | 0.57 | 44.05 | 0.67 | 52.43 |
| B | MarketPlace | 1.91 | 57.03 | 2.02 | 60.49 | 2.10 | 64.13 |
| B | RitualDance | 2.15 | 30.87 | 2.47 | 36.85 | 2.84 | 43.98 |
| B | Cactus | 0.64 | 28.51 | 0.88 | 36.64 | 1.19 | 44.61 |
| B | BasketballDrive | 0.58 | 24.24 | 0.86 | 32.73 | 1.11 | 37.45 |
| B | BQTerrace | 0.44 | 21.44 | 0.64 | 30.82 | 0.87 | 40.05 |
| C | RaceHorses | 0.52 | 24.20 | 0.67 | 33.38 | 0.88 | 42.54 |
| C | BQMall | 0.49 | 16.49 | 0.70 | 24.68 | 1.14 | 35.20 |
| C | PartyScene | 0.33 | 14.29 | 0.53 | 26.20 | 0.84 | 40.40 |
| C | BasketballDrill | 1.06 | 21.41 | 1.61 | 30.94 | 2.29 | 40.44 |
| D | RaceHorses | 0.37 | 13.62 | 0.49 | 22.89 | 0.84 | 32.66 |
| D | BQSquare | 0.39 | 7.13 | 0.53 | 18.34 | 0.88 | 29.49 |
| D | BlowingBubbles | 0.38 | 10.92 | 0.66 | 21.66 | 0.93 | 34.15 |
| D | BasketballPass | 0.53 | 9.39 | 0.93 | 19.63 | 1.43 | 29.04 |
| E | FourPeople | 1.25 | 20.27 | 1.56 | 27.67 | 2.05 | 38.26 |
| E | Johnny | 0.79 | 21.01 | 1.07 | 28.06 | 1.62 | 37.73 |
| E | KristenAndSara | 1.21 | 25.36 | 1.50 | 30.67 | 1.90 | 39.86 |
| | Average | 1.02 | 27.94 | 1.24 | 35.26 | 1.53 | 43.11 |
| | Standard Deviation | 0.67 | 14.49 | 0.66 | 12.10 | 0.64 | 9.39 |
Table 4. Performance comparison of the proposed algorithm with [12,17,25]. Each method reports BDBR (%), TS (%), and TS/BDBR; "-" indicates results not reported in [12].

| Class | Sequence | [12] BDBR | [12] TS | [12] TS/BDBR | [17] BDBR | [17] TS | [17] TS/BDBR | [25] BDBR | [25] TS | [25] TS/BDBR | S2 BDBR | S2 TS | S2 TS/BDBR |
|-------|----------|-----------|---------|--------------|-----------|---------|--------------|-----------|---------|--------------|---------|-------|------------|
| A1 | Tango2 | - | - | - | 2.42 | 64.45 | 26.63 | 2.11 | 39.23 | 18.59 | 2.59 | 60.33 | 23.29 |
| A1 | FoodMarket4 | - | - | - | 1.47 | 46.93 | 31.93 | 1.23 | 39.17 | 31.85 | 2.06 | 43.30 | 21.02 |
| A1 | Campfire | - | - | - | 2.65 | 64.74 | 24.43 | 1.91 | 37.33 | 19.54 | 1.21 | 49.11 | 40.59 |
| A1 | Average | - | - | - | 2.18 | 58.71 | 27.66 | 1.75 | 38.58 | 23.33 | 1.95 | 50.91 | 28.30 |
| A2 | CatRobot1 | - | - | - | 3.27 | 63.81 | 19.51 | 2.20 | 37.96 | 17.25 | 2.14 | 46.14 | 21.56 |
| A2 | DaylightRoad2 | - | - | - | 2.02 | 70.39 | 34.85 | 1.44 | 38.57 | 26.78 | 1.56 | 51.24 | 32.85 |
| A2 | ParkRunning3 | - | - | - | 1.46 | 55.14 | 37.77 | 1.21 | 39.69 | 32.80 | 0.57 | 44.05 | 77.28 |
| A2 | Average | - | - | - | 2.25 | 63.11 | 30.71 | 1.62 | 38.74 | 25.61 | 1.42 | 47.14 | 43.90 |
| B | MarketPlace | - | - | - | 2.58 | 71.93 | 27.88 | 1.64 | 34.61 | 21.10 | 2.02 | 60.49 | 29.95 |
| B | RitualDance | - | - | - | 4.21 | 64.06 | 15.22 | 2.14 | 37.12 | 17.35 | 2.47 | 36.85 | 14.92 |
| B | Cactus | - | - | - | 2.78 | 66.61 | 23.96 | 1.37 | 36.30 | 26.50 | 0.88 | 36.64 | 41.64 |
| B | BasketballDrive | - | - | - | 2.38 | 67.81 | 28.49 | 2.08 | 36.94 | 17.76 | 0.86 | 32.73 | 38.06 |
| B | BQTerrace | - | - | - | 2.43 | 64.25 | 26.44 | 0.71 | 22.48 | 31.66 | 0.64 | 30.82 | 48.16 |
| B | Average | - | - | - | 2.88 | 66.93 | 24.40 | 1.59 | 33.49 | 22.87 | 1.37 | 39.51 | 34.55 |
| C | RaceHorses | 3.04 | 10.69 | 3.52 | 2.00 | 62.10 | 31.05 | 0.73 | 29.26 | 40.08 | 0.67 | 33.38 | 49.82 |
| C | BQMall | 2.73 | 10.93 | 4.00 | 2.92 | 62.93 | 21.55 | 1.19 | 29.92 | 25.14 | 0.70 | 24.68 | 35.26 |
| C | PartyScene | 0.33 | 13.86 | 42.00 | 1.40 | 58.77 | 41.98 | 0.43 | 19.39 | 45.09 | 0.53 | 26.20 | 49.43 |
| C | BasketballDrill | 0.93 | 16.49 | 17.73 | 5.39 | 65.29 | 12.11 | 1.46 | 25.95 | 17.77 | 1.61 | 30.94 | 19.22 |
| C | Average | 1.76 | 12.99 | 16.81 | 2.93 | 62.27 | 26.67 | 0.95 | 26.13 | 32.02 | 0.88 | 28.80 | 38.43 |
| D | RaceHorses | 3.46 | 12.25 | 3.54 | 1.69 | 58.98 | 34.90 | 0.56 | 19.25 | 34.38 | 0.49 | 22.89 | 46.71 |
| D | BQSquare | 2.00 | 29.78 | 14.89 | 1.68 | 59.98 | 35.70 | 0.21 | 13.20 | 62.86 | 0.53 | 18.34 | 34.60 |
| D | BlowingBubbles | 1.78 | 25.60 | 14.38 | 2.24 | 59.94 | 26.76 | 0.50 | 19.78 | 39.56 | 0.66 | 21.66 | 32.82 |
| D | BasketballPass | 1.50 | 6.42 | 4.28 | 2.34 | 61.15 | 26.13 | 1.06 | 23.84 | 22.49 | 0.93 | 19.63 | 21.11 |
| D | Average | 2.19 | 18.51 | 9.27 | 1.99 | 60.01 | 30.87 | 0.58 | 19.02 | 39.82 | 0.65 | 20.63 | 33.81 |
| E | FourPeople | - | - | - | 4.36 | 67.14 | 15.40 | 1.44 | 36.22 | 25.15 | 1.56 | 27.67 | 17.74 |
| E | Johnny | - | - | - | 4.34 | 67.01 | 15.44 | 1.82 | 36.47 | 20.04 | 1.07 | 28.06 | 26.22 |
| E | KristenAndSara | - | - | - | 3.56 | 66.21 | 18.60 | 1.53 | 32.16 | 21.02 | 1.50 | 30.67 | 20.45 |
| E | Average | - | - | - | 4.09 | 66.79 | 16.48 | 1.60 | 34.95 | 22.07 | 1.38 | 28.80 | 21.47 |
| | Sequence Average | 1.97 | 15.75 | 13.04 | 2.71 | 63.16 | 26.22 | 1.32 | 31.13 | 27.94 | 1.24 | 35.26 | 33.76 |
Table 5. BDBR and TS performance for 4K sequences from the UVG dataset at the S2 operating point.

| Sequence | BDBR (%) | TS (%) | TS/BDBR |
|------------|----------|--------|---------|
| Bosphorus | 0.85 | 36.78 | 43.27 |
| FlowerKids | 1.66 | 44.66 | 26.90 |
| FlowerPan | 1.31 | 47.65 | 36.37 |
| Jockey | 0.84 | 44.31 | 52.75 |
| RiverBank | 0.86 | 32.19 | 37.43 |
| Twilight | 2.50 | 51.65 | 20.66 |
| Average | 1.34 | 42.87 | 36.23 |
Table 6. Computational overhead (%) introduced by generating the perception-driven pixel map and the YOLO-based object map.

| Class | Sequence | Perception-Driven Pixel Map (%) | YOLO-Based Object Map (%) |
|-------|----------|---------------------------------|---------------------------|
| A1 | Tango2 | 0.3024 | 0.8820 |
| A1 | FoodMarket4 | 0.4006 | 1.3430 |
| A1 | Campfire | 0.0732 | 0.2463 |
| A2 | CatRobot1 | 0.1132 | 0.2959 |
| A2 | DaylightRoad2 | 0.1054 | 0.2757 |
| A2 | ParkRunning3 | 0.0572 | 0.1812 |
| B | MarketPlace | 0.0980 | 1.0304 |
| B | RitualDance | 0.0954 | 1.2673 |
| B | Cactus | 0.0371 | 0.4574 |
| B | BasketballDrive | 0.0619 | 0.6577 |
| B | BQTerrace | 0.0304 | 0.3639 |
| C | RaceHorses | 0.0338 | 1.4641 |
| C | BQMall | 0.0327 | 1.4531 |
| C | PartyScene | 0.0240 | 0.9632 |
| C | BasketballDrill | 0.0455 | 1.8403 |
| D | RaceHorses | 0.0291 | 4.2905 |
| D | BQSquare | 0.0233 | 4.1289 |
| D | BlowingBubbles | 0.0251 | 3.7639 |
| D | BasketballPass | 0.0355 | 5.7824 |
| E | FourPeople | 0.0448 | 1.1500 |
| E | Johnny | 0.0815 | 1.7914 |
| E | KristenAndSara | 0.0731 | 1.6893 |
| | Average | 0.0829 | 1.6054 |