Article

Activity-Based Block Partitioning Decision Method for Versatile Video Coding

School of Electronic and Information Engineering, Korea Aerospace University, Goyang 10540, Korea
*
Author to whom correspondence should be addressed.
Electronics 2022, 11(7), 1061; https://doi.org/10.3390/electronics11071061
Submission received: 26 February 2022 / Revised: 23 March 2022 / Accepted: 24 March 2022 / Published: 28 March 2022
(This article belongs to the Section Electronic Multimedia)

Abstract:
Versatile Video Coding (VVC), the latest international video coding standard, achieves more than twice the compression performance of High-Efficiency Video Coding (HEVC) by adopting various coding techniques. The multi-type tree (MTT) block structure offers more flexible block partitioning by allowing binary tree (BT) and ternary tree (TT) structures in addition to the quadtree (QT) structure. Because VVC selects the optimal block partition by performing encoding on all possible CU partitions, the encoding complexity increases enormously. In this paper, we observe the relationship between block partitions and activity, a measure of block texture complexity. Based on experimental observations, we propose an activity-based fast block partitioning decision method to reduce the encoding complexity. The proposed method uses only information from the current block, without relying on neighboring or upper blocks, and minimizes the dependency on the QP. For these reasons, the proposed algorithm is simple and parallelizable. In addition, by reusing the gradient calculation employed in VVC's ALF, a VVC-friendly fast algorithm was designed. The proposed method consists of a two-step decision process. The first step terminates block partitioning early based on the posterior probability observed from the relationship between block size and activity per sample. Next, the sub-activities of the current block are used to determine the type and direction of partitioning. The experimental results show that, in the all-intra configuration, the proposed method can reduce the encoding time of the VVC test model (VTM) by up to 45.15% with 2.80% BD-rate loss.

1. Introduction

Owing to a rapid evolution of the media environment and widespread use of immersive media contents requiring high resolution, such as ultra-high definition (UHD) and virtual reality (VR), there is a need for even more efficient compression than the existing High-Efficiency Video Coding (HEVC) standard. To fulfill these requirements of the next-generation video coding standard, the Joint Video Experts Team (JVET), a collaborative team comprising the ITU-T Video Coding Experts Group (VCEG) and ISO/IEC Moving Picture Experts Group (MPEG), has developed Versatile Video Coding (VVC).
In VVC, many new coding techniques have been employed to achieve more than twice the coding efficiency of HEVC. First, the multi-type tree (MTT) has been adopted as a new block partitioning structure that allows not only quadtree (QT) but also binary tree (BT) and ternary tree (TT) splits, enabling efficient encoding adapted to distinct local characteristics for intra- and inter-prediction. In intra-prediction, tools for exploiting spatial correlations were adopted, for instance, 67 intra-prediction modes, wide-angle intra-prediction, and a cross-component linear model. In addition, matrix-based intra-prediction generates intra-predicted blocks through matrix multiplication operations whose matrices were trained using neural networks. In inter-prediction, various tools are used to obtain more precise motion information. For example, merge mode with motion vector differences and geometric partitioning mode, which facilitates nonrectangular partitioning of blocks, are available as variants of the merge mode. With the introduction of several other new tools for transform, in-loop filtering, and entropy coding, VVC offers a 36.06% improvement in terms of the Bjontegaard Delta rate (BD-rate) over HEVC for UHD and HD sequences [1]. However, as the coding efficiency increases with the adoption of these new tools, the encoding and decoding complexities increase by 8.5 times and 1.5 times, respectively [1]. Reducing this complexity is indispensable for realizing real-time applications as VVC is commercialized in the future.
In HEVC, a coding tree unit (CTU) is recursively split into coding units (CUs) in a QT structure. Whether inter- or intra-prediction is to be performed is determined at the CU level, and each leaf CU is divided into two or four prediction units (PUs). The residual signal between the predicted signal and the original signal according to the determined PUs is coded in an optimal transform unit (TU) by means of recursive QT partitioning at the leaf CU again [2].
Among the various technologies adopted in VVC, the MTT block partitioning structure is one of the technologies that provides the most significant coding gains while introducing enormous complexity. The VVC test model (VTM) 1.0 introduced the MTT block structure based on the HEVC test model at an early development stage. MTT in VVC has a more flexible block partitioning structure than that of HEVC because it removes the CU, PU, and TU distinction used in HEVC and allows BT, TT, and QT to adapt to various local characteristics, as shown in Figure 1 [3,4]. Figure 1 presents an example of block partitioning in VVC. If a current CU is a leaf node of QT, it can be recursively split by QT, BT, or TT. However, if a current CU is a leaf node of BT or TT, it cannot be split by QT but can be recursively split by BT or TT. Both BT and TT splits can be applied along either the horizontal or the vertical direction. The rate-distortion (RD) cost competition over all possible partitions of a CU is performed to determine the best block partition. In other words, in selecting the best block, the RD cost is first computed for the current CU over the entire encoding process, including intra- or inter-prediction, transform, quantization, and entropy coding. The current block is then split in the order QT-BT-TT, and the encoding process is executed for each sub-block. The RD cost computed for each sub-block is summed and compared with the RD cost of the upper partitioning, and the block structure with the minimum RD cost is selected as the best block partition. Because the entire encoding process, including all coding tools, must be run for every possible partitioned block, the encoding complexity increases tremendously. MTT provided a 17.46% BD-rate gain and increased the encoding time by 202% in VTM 1.0 [5].
Although it provides high compression performance, the relative increase in complexity is rather significant, even when using only MTT; with the use of other technologies in the MTT block structure, the complexity increases further. Therefore, many studies on block partitioning are actively being conducted to reduce this complexity. In this paper, we propose a method for rapidly making block partitioning decisions based on block activity that measures the degree of block texture complexity.
The remainder of this paper is organized as follows. In Section 2, we provide references to several works on block partitioning in VVC. Section 3 presents the proposed approach for making fast block partitioning decisions based on activity. In Section 4, we provide the experimental results and their analysis. Section 5 presents our concluding remarks.

2. Related Works

To reduce the complexity of the HEVC encoder, various fast algorithms have been studied. In [6,7,8,9,10], information from the CU is exploited for making partitioning decisions as follows. The texture characteristics of the CU and coding information from the neighboring CUs were utilized for early termination in [6]. Pruning-based approaches were proposed in [7,8,9] by exploiting the correlations of CUs and coding information. In [7], mode selection and the bi-directional depth search method based on the sum of absolute transformed difference costs were proposed. In [8], the redundant mode was pruned, and partition checks were performed using the statistics on prediction residuals. In [9], by combining the edge complexities of the CU and its four sub-CUs, it was determined whether the CU was split or non-split. In [10], CUs were classified as natural content CUs and screen content CUs based on the statistical characteristics of the content. Then, a few modes and partitioning processes were skipped depending on the type of content.
Machine learning is widely used in computer vision for tasks such as classification and object detection, and it offers high levels of performance. Even in fast video coding algorithms, many methods are available for classifying, analyzing, and applying the features of blocks by using machine learning. Support vector machine (SVM) is one of the most widely used machine learning methods, especially for making fast partitioning decisions. In [11], a CU-splitting early termination algorithm was proposed based on a binary classification performed using the SVM technique. In [12], a model that optimizes complexity allocation at the CU level given an RD cost was designed. In [13], an adaptive fast CU size decision algorithm was proposed based on CU complexity classification by using SVM. In [14], a more precise classification method based on SVM was proposed. In the first stage of classification, the CU size is terminated early or skipped to the next depth level. Then, the CU size decision is refined by means of binary classification, which learns from previous coded frames. Convolutional neural network (CNN)-based fast algorithms have been proposed to determine the best CU partitioning [15,16]. In [15], an early-terminated hierarchical CNN was proposed; this network was trained using a hierarchical CU partition map. In [16], a CNN was used to investigate the textures of one CU and identify a promising candidate for partitioning. In addition, hardware implementation for parallel processing was considered.
As in the case of HEVC, a considerable amount of research has been conducted to reduce the encoding complexity of VVC [17,18,19,20,21,22,23,24,25,26,27,28,29,30]. With the change in the partitioning structure to MTT, methods using various approaches applied to the QT structure in HEVC have been studied. A simple early decision method was used to reduce TT complexity in [17]. In [17], by using the RD cost information obtained from BT partitioning, the TT split was determined early. A fast QT with nested multi-type tree partitioning method was proposed based on local variance and gradient [18]. In [19], a probabilistic approach based on an RD model was proposed. An RD model based on the motion divergence field was established to predict the RD cost of each partition mode. In [20], a tunable CU size decision model was proposed. The distortion was obtained as the difference between the original and predicted signals. In [21], a method for gradient-based early termination of CU partitioning was proposed. In this method, a directionality derived using various directional gradients was used to determine split modes.
To classify or decide partitioning structures, machine learning-based approaches have been studied. In [22,23], methods based on random forest classifiers (RFCs) were proposed to reduce the complexity of the MTT structure. The approach proposed in [22] was based on the RFC model, and it analyzed the energy of the current CU based on the characteristics of the texture regions to skip unnecessary intra-prediction modes. In [23], two classifiers based on RFC were designed to predict split mode decisions. In [24], features extracted from the CU were used to characterize its complexity. Then, the trained decision tree was used to predict the partition results. In [25], the block size was determined using a cascade decision tree. Moreover, intra mode decision-making using gradient descent search was introduced. In [26], a joint-classifier decision tree structure that determines partitioning parameters using local characteristics at the CTU level was proposed. Additionally, in [27], a CNN-based QT plus binary-tree partitioning decision method was proposed, in which the partitioning depth was adaptively determined. In [28], a deep MSE-CNN model was used to determine CU partitions and design an early-exit mechanism that can skip redundant RD-checking processes.
Most of the existing fast-partitioning algorithms exploit one or more of the following:
  • The correlation between the current block and neighboring blocks;
  • Information about the upper block partition;
  • Texture characteristics of the current block;
  • Machine learning-based approaches.
Each method has distinct advantages and limitations in terms of parallel processing, coding efficiency, and complexity. The information of the current block can be predicted more accurately by using the information of the neighboring blocks and/or the upper depth’s blocks, but parallel encoding may be difficult owing to dependence on neighboring blocks and/or the upper depth’s blocks. Learning-based algorithms may have offline learning complexity, and a considerable amount of information may be required for learning. However, because the current block can be predicted with high accuracy, these algorithms are effective in terms of coding efficiency. By contrast, in general, approaches in which the information of only the current block is used are less accurate than other methods, but because there is no dependence, parallel encoding is possible, and a simple partitioning decision algorithm can be designed. In addition, if various types of block information, such as neighboring information and residual signal, are used, the characteristics of such information may vary depending on the quantization parameter (QP). Therefore, the algorithm may be complex because it would be necessary to ensure that the algorithm can operate adaptively to various QPs.
In this paper, an activity-based fast block partitioning decision method is proposed. The proposed method exploits the activity to be obtained by using gradients calculated from the original sample of the current block to classify block complexity. Therefore, the proposed method enables parallel processing because it does not depend on information about other blocks and QPs, and it is VVC-friendly because it reuses the gradient calculation module used in VVC.

3. Proposed Block Partitioning Decision Method

In this section, an activity-based fast CU-partitioning decision method is described. First, the relationship between block activity and block texture characteristics is observed and analyzed to design a CU-partitioning decision algorithm. Next, the proposed method based on the aforementioned analysis is described.

3.1. Observations and Analysis: Block Activity and Texture Characteristics

Gradient is one of the features used to observe the texture characteristic of an image. Block gradient can help one to roughly grasp the edge characteristics of the block. The adaptive loop filter (ALF), one of the in-loop filters used in VVC, classifies the characteristics of blocks by using block gradients [5]. In ALF, based on the directionality and activity derived using local gradients, a filter is selected from among 25 predefined filters for a given block. To derive directionality and activity, block gradients along the horizontal, vertical, and two diagonal directions are calculated using the one-dimensional (1D) Laplacian as follows:
$$g_v = \sum_{k=i-2}^{i+3} \sum_{l=j-2}^{j+3} \left| 2R(k,l) - R(k,l-1) - R(k,l+1) \right|, \tag{1}$$
$$g_h = \sum_{k=i-2}^{i+3} \sum_{l=j-2}^{j+3} \left| 2R(k,l) - R(k-1,l) - R(k+1,l) \right|, \tag{2}$$
$$g_{d1} = \sum_{k=i-2}^{i+3} \sum_{l=j-2}^{j+3} \left| 2R(k,l) - R(k-1,l-1) - R(k+1,l+1) \right|, \tag{3}$$
$$g_{d2} = \sum_{k=i-2}^{i+3} \sum_{l=j-2}^{j+3} \left| 2R(k,l) - R(k-1,l+1) - R(k+1,l-1) \right|, \tag{4}$$
where $R(i,j)$ denotes the reconstructed sample at position $(i,j)$ within a $4 \times 4$ block. The calculated gradients are compared with each other to determine directionality. In addition, activity is calculated from the sum of the horizontal and vertical gradients as:
$$A = g_h + g_v. \tag{5}$$
In this paper, we observe the texture characteristics of blocks by utilizing the gradients and activity given by (1), (2), and (5), respectively. Utilization of the gradients that are already being used in VVC makes software and hardware implementation more VVC friendly because the proposed method can use the same gradient calculation module.
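As a concrete illustration, the 1D-Laplacian gradients of (1)–(4) and the activity of (5) can be sketched as follows. This is a minimal Python sketch, not VTM code; the sample array, function names, and index convention are ours:

```python
import numpy as np

def laplacian_gradients(R, i, j):
    """1D-Laplacian gradients for the 4x4 block at (i, j), accumulated over
    the 6x6 window k = i-2..i+3, l = j-2..j+3, mirroring Eqs. (1)-(4).
    R is a 2D sample array; the caller must ensure the window plus its
    1-sample border lies inside R."""
    gv = gh = gd1 = gd2 = 0
    for k in range(i - 2, i + 4):
        for l in range(j - 2, j + 4):
            c = 2 * int(R[k, l])
            gv  += abs(c - int(R[k, l - 1]) - int(R[k, l + 1]))          # Eq. (1)
            gh  += abs(c - int(R[k - 1, l]) - int(R[k + 1, l]))          # Eq. (2)
            gd1 += abs(c - int(R[k - 1, l - 1]) - int(R[k + 1, l + 1]))  # Eq. (3)
            gd2 += abs(c - int(R[k - 1, l + 1]) - int(R[k + 1, l - 1]))  # Eq. (4)
    return gv, gh, gd1, gd2

def block_activity(R, i, j):
    """Activity A = g_h + g_v, Eq. (5)."""
    gv, gh, _, _ = laplacian_gradients(R, i, j)
    return gh + gv
```

On a flat block, all four gradients are zero; a checkerboard pattern maximizes $g_v$ and $g_h$ while leaving the diagonal gradients at zero.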
To observe the relationship between the characteristics of $\{g_v, g_h, A\}$ and the block partitions, the block partition results are overlapped with the sample-wise gradients and activity frame, as shown in Figure 2 and Figure 3. Figure 2 shows the original image, gradient frames, and activity frame of BasketballPass, one of the test sequences in the JVET common test conditions (CTCs) [31]. As shown in Figure 2, the larger the two gradients and activity values, the whiter they appear. In the case of the vertical gradient $g_v$, a high value is shown where a horizontal edge exists, whereas in the case of the horizontal gradient $g_h$, a high value is shown where a vertical edge exists. The activity, $A$, is the sum of the two gradients, representing the complexity of sample variation. Therefore, the value of $A$ is generally high where an object exists or an edge is prominent.
Figure 3 shows the block partitions overlapped with the original image, gradients, and activity frame, respectively. In Figure 3, we compare the three observation values and the block-partitioned images. The block partitions were obtained based on the experimental encoding using VTM 10.2 with QP 37 and the all-intra (AI) configuration [31]. To distinguish the block partition modes, QTs are indicated by blue boxes, horizontal splits including BT-H and TT-H are indicated by green boxes, and vertical splits including BT-V and TT-V are indicated by red boxes. As shown in Figure 3b,c, the horizontal split is prominent in the area where $g_v$ is strong, whereas the vertical split is prominent in the area where $g_h$ is strong. In addition, the block size is large in the area where $A$ is small, whereas many blocks are partitioned in the area where $A$ is large, as shown in Figure 3d.
Based on the above observations, activity $A$ may be related to the size of the split block and the degree of splitting. In (5), $A$ is calculated as the sum of the two gradients. However, in the present paper, the activity per sample $A_s$, defined as follows, is used to fairly compare the complexities of different block sizes:
$$A_s = \frac{1}{w+h}\left(g_v + g_h\right), \tag{6}$$
where $w$ and $h$ denote the width and height of the CU, respectively. Figure 4 shows the $A_s$ distribution according to block size for sequences of various resolutions encoded with QP 27. In the $A_s$ distributions, large CUs tend to have low activity, and small CUs tend to have high activity. As shown in Figure 4, blocks that are split many times, in regions with many objects or considerable texture, have high activity, whereas larger blocks have lower activity.
Figure 5 shows the cumulative distribution function of activity (CDFA) as a function of QP for various sequences. As shown in Figure 5, the CDFAs of each sequence are nearly identical across QPs, and the average activity at a specific CDFA value barely changes with QP, as shown in Figure 6. This means that $A_s$ hardly depends on the QP and can represent the complexity of the current block. In addition, the degree of fast partitioning can be controlled by adjusting the activity threshold according to the CDFA of the current block. Therefore, based on these observations, we propose an activity-based fast partitioning decision method that can control the partitioning decision speed with little QP dependence.
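Selecting a threshold from the CDFA amounts to reading off an empirical quantile of the observed per-block $A_s$ values; a minimal sketch (the function name and the use of NumPy's quantile are our assumptions):

```python
import numpy as np

def activity_at_cdfa(activities, level):
    """Return the A_s value at which the empirical cumulative distribution
    of observed activities reaches `level` (e.g. 0.9), i.e. the
    `level`-quantile of the activity samples."""
    return float(np.quantile(np.asarray(activities, dtype=float), level))
```

For example, the activity at CDFA = 0.9 over a set of measured $A_s$ values would serve as a threshold candidate for that level.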

3.2. Proposed Activity-Based Fast Block Partitioning

In this section, we describe the proposed fast block partitioning decision method based on the observations described in Section 3.1. In VVC, block partitioning is performed in the order of QT-BT-TT. Then, an optimal block partition is determined through RD cost competition for all possible block partitions. The proposed method rapidly determines the block partitions by skipping the RD computation for all or parts of the BT and/or TT partitions. Because it skips the RD computation for some partitioning modes, the encoding complexity can be reduced.
In the first step, block partitioning is terminated early based on the block size and the activity per sample $A_s$. In this step, four partitioning modes, namely {BT-H, BT-V, TT-H, TT-V}, can be skipped. Next, the block partitioning mode is determined using activity and directionality. To reduce computational complexity, instead of deriving samples in the areas outside the current block that are required for the gradient calculation, those areas are padded with the samples closest to the current block.
As shown in Figure 4, the larger the block size, the lower is the activity. To quantitatively verify this tendency, we observed the statistical characteristics of activity according to the block size. Table 1 and Table 2 summarize our analysis of the encoding results shown in Figure 4. The posterior probability was calculated to determine block partitioning according to the block size and $A_s$. The prior probability condition for the block size was defined as follows:
$$\varepsilon = \begin{cases} 1, & \text{if block samples} \geq T_b \\ 0, & \text{otherwise}, \end{cases} \tag{7}$$
where $T_b$ denotes the threshold of the number of block samples and $\varepsilon$ denotes the evidence. Table 1 shows the prior probability, $p(\varepsilon)$, of the sequences with different resolutions. Table 2 shows the posterior probability, $p(a|\varepsilon)$, that is, the probability of activity according to the block size. $T_a$ denotes the threshold of activity, which is experimentally determined as the average of the activity values when the CDFA is 0.9 in Figure 5. The results indicate that the posterior probability increases when $\varepsilon$ is 1. That is, if the block size is greater than or equal to $T_b$, the probability that the activity is smaller than or equal to $T_a$ increases. For example, when $T_b$ is equal to 512, the average prior probability, $p(\varepsilon)$, is 4.59%, but the posterior probability, $p(a|\varepsilon=1)$, is 100%, which is a very high value. Likewise, when $T_b$ is equal to 256, the posterior probability exceeds 95%. Thus, in the proposed method, when the number of samples in the block is greater than or equal to $T_b$ and $A_s$ is less than or equal to $T_a$, block partitioning is not performed.
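The resulting first-step rule can be sketched as a simple predicate. This is a sketch, not VTM code; the parameter names are ours, with $T_b = 256$ taken from the paper and $T_a$ supplied by the CDFA-based model:

```python
def early_terminate_mtt(width, height, a_s, t_b, t_a):
    """Step 1: all four MTT split modes {BT-H, BT-V, TT-H, TT-V} are
    skipped when the block is large (at least t_b samples) and smooth
    (activity per sample a_s at most t_a)."""
    return width * height >= t_b and a_s <= t_a
```

A large smooth block (e.g. 16x16 with low $A_s$) is left unsplit, while small or textured blocks fall through to the second decision step.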
There are five partitioning modes in VVC, as shown in Figure 1. Depending on the block size and activity, the decision of four partitioning modes, namely {BT-H, BT-V, TT-H, TT-V}, can be skipped. However, as summarized in Table 1, because $p(\varepsilon=1)$ is less than 15%, it is not often possible to skip block partitioning based on block size. In other words, most blocks are split at least once. Therefore, when $\varepsilon$ is not equal to 1, the second block partitioning decision step is executed.
The second step uses sub-activities based on $A_s$ and directionality to determine the block partitioning mode. Sub-activities are defined as the activities of the four sub-blocks obtained by dividing the current block along each of the two directions. Figure 7 shows the horizontal and vertical sub-activities. As observed in Section 3.1, partitioned blocks tend to have low activity. However, the more complex the content in a block, the higher its activity, even if the block is split into small blocks. Therefore, it is possible to identify regions that contain more texture by using sub-activities. To determine the split mode, the sub-activities $subBT_\alpha$ and $subTT_\alpha$ are calculated as follows:
$$subBT_\alpha = \left\{ A_{sub0}^{\alpha} + A_{sub1}^{\alpha},\ A_{sub2}^{\alpha} + A_{sub3}^{\alpha} \right\}, \quad \alpha \in \{H, V\}, \tag{8}$$
$$subTT_\alpha = \left\{ A_{sub0}^{\alpha} + A_{sub3}^{\alpha},\ A_{sub1}^{\alpha} + A_{sub2}^{\alpha} \right\}, \quad \alpha \in \{H, V\}, \tag{9}$$
where each $subA_\alpha$ consists of two elements, each the sum of two sub-activities, which distinguish the two regions used for making split-mode decisions, as shown in Figure 7.
To determine the split mode, the two elements of $subA_\alpha$ are compared to the activity threshold $T_a$ as:
$$subA_\alpha[0] \leq T_a \ \ \mathrm{xor}\ \ subA_\alpha[1] \leq T_a. \tag{10}$$
Equation (10) is used as the condition for making split-mode decisions. In the current block, the activity of potential partition blocks can be obtained through the sub-activities. Each sub-activity is compared to the activity threshold $T_a$ to determine the split mode. If only one sub-activity is less than $T_a$, the complexity of the corresponding block is low when it is partitioned. Therefore, it is possible to skip the other split modes by predicting the split modes capable of lowering the sub-activity in the current block. If both sub-activities are less than $T_a$, there is no restriction on block partitioning because no direction can be singled out. For example, if (10) is satisfied when $\alpha$ is equal to $H$, the current block can be split using a horizontal split mode. Similarly, if (10) is satisfied when $\alpha$ is equal to $V$, the current block can be split using a vertical split mode. However, if the conditions for both $H$ and $V$ are satisfied, the texture distributions of the two sides may be similar, so no split mode is restricted. In other words, after checking (10) for $H$ and $V$, if only one of $H$ or $V$ satisfies the condition, the split modes to be excluded are determined; otherwise, all split modes are allowed.
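Under our reading of (8)–(10), the per-mode test can be sketched as follows, where `sub_h` and `sub_v` hold the four sub-block activities $A_{sub0}^{\alpha} \ldots A_{sub3}^{\alpha}$ for each direction; the function and flag names are ours:

```python
def split_mode_flags(sub_h, sub_v, t_a):
    """Evaluate condition (10) for each grouping of Eqs. (8) and (9):
    a mode is flagged when exactly one of its two element sums is <= t_a,
    i.e. one side of the split would be smooth."""
    def xor_low(e0, e1):
        # Eq. (10): exactly one element at or below the activity threshold
        return (e0 <= t_a) != (e1 <= t_a)
    return {
        "canBT-H": xor_low(sub_h[0] + sub_h[1], sub_h[2] + sub_h[3]),
        "canTT-H": xor_low(sub_h[0] + sub_h[3], sub_h[1] + sub_h[2]),
        "canBT-V": xor_low(sub_v[0] + sub_v[1], sub_v[2] + sub_v[3]),
        "canTT-V": xor_low(sub_v[0] + sub_v[3], sub_v[1] + sub_v[2]),
    }
```

For a block whose top half is smooth and bottom half textured, only the horizontal BT grouping separates a smooth from a textured region, so only `canBT-H` is raised.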
Directionality is used as an auxiliary condition to increase the accuracy of the split-mode decision based on sub-activities. The directionality, $D$, is determined by comparing (1) and (2) as follows:
$$D = \begin{cases} d_h, & \text{if } g_v > 1.25 \times g_h \\ d_v, & \text{if } g_h > 1.25 \times g_v \\ none, & \text{otherwise}, \end{cases} \tag{11}$$
where $d_h$ denotes the horizontal direction, meaning that the current block has horizontal edges, and $d_v$ denotes the vertical direction, meaning that the current block has vertical edges. As an example of the application of $D$, if the horizontal split mode is selected by the sub-activity-based block partitioning decision, the accuracy of the split-mode decision is improved by checking whether $D$ is $d_h$ as an auxiliary condition.
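A sketch of (11); the strict inequality is our assumption where the published inequality sign is ambiguous:

```python
def directionality(g_v, g_h, ratio=1.25):
    """Eq. (11): 'd_h' (horizontal edges) when the vertical gradient
    dominates, 'd_v' (vertical edges) when the horizontal gradient
    dominates, and 'none' when neither dominates by the 1.25 ratio."""
    if g_v > ratio * g_h:
        return "d_h"
    if g_h > ratio * g_v:
        return "d_v"
    return "none"
```
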

3.3. Flowchart of Proposed Block Partitioning Decision Method

In VVC, block partitioning proceeds in the following order: {QT, BT-H, BT-V, TT-H, TT-V}. In the proposed method, as shown in Figure 8, two steps are executed to determine block partitioning after the QT split. In the first step, the decision on the split modes {BT-H, BT-V, TT-H, TT-V} is skipped depending on the block size and activity of the current block; that is, block partitioning is terminated early. Next, if the skip condition of the first step is not fulfilled, the second step for determining BT and TT partitioning is executed. In the second step, sub-activities are used with (10). Four flags {canBT-H, canBT-V, canTT-H, canTT-V} are set in the proposed method, each of which indicates whether the corresponding split mode is possible. For instance, in the BT partitioning decision process, if only one of the BT flags is true, the split mode indicated by that flag is determined as the final one, and the other split-mode decisions are not executed. Because the decision processes for the other BT split mode and the two TT split modes are skipped, the encoding complexity can be reduced greatly. If both BT flags are true or both are false, the TT partitioning decision process is executed. Similarly, only the TT split mode whose flag is true can be selected as the final mode. Additionally, as mentioned in Section 3.2, the directionality, $D$, can be used as an auxiliary condition to increase the accuracy of the partitioning decision. As shown in Figure 8, $D$ can be checked optionally after condition (10) is passed.
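Our reading of the Figure 8 flow can be sketched as a driver over the four flags. Exactly how an inconclusive TT check combines with the BT modes is not fully specified in the text, so the fallback branches below are assumptions:

```python
def mtt_modes_to_check(flags):
    """Return the MTT split modes whose RD check is still executed.
    If exactly one BT flag is set, only that BT mode is checked; otherwise
    the TT flags are consulted with the same single-flag rule; if that is
    also inconclusive, all four modes are RD-checked."""
    bt = [m for m in ("BT-H", "BT-V") if flags["can" + m]]
    if len(bt) == 1:
        return bt                        # other BT and both TT skipped
    tt = [m for m in ("TT-H", "TT-V") if flags["can" + m]]
    if len(tt) == 1:
        return ["BT-H", "BT-V"] + tt     # only the flagged TT is checked
    return ["BT-H", "BT-V", "TT-H", "TT-V"]
```
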

4. Experimental Results

4.1. Test Conditions

The proposed method was implemented in the VVC reference software VTM 10.2 [32] and evaluated using the JVET CTCs for standard dynamic range (SDR) video sequences [31]. Experiments were performed in the AI configuration with four QPs of {22, 27, 32, 37}. The proposed method was evaluated in terms of coding efficiency and encoding complexity, which were measured using BD-rate and encoding time saving (ETS), respectively. ETS was computed as follows:
$$\mathrm{ETS}(\%) = \frac{T_{org} - T_{prop}}{T_{org}} \times 100, \tag{12}$$
where $T_{org}$ and $T_{prop}$ denote the total encoding times of the original and proposed methods, respectively.
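For reference, (12) is simply the relative reduction in total encoding time:

```python
def encoding_time_saving(t_org, t_prop):
    """ETS (%) of Eq. (12): relative reduction in total encoding time."""
    return (t_org - t_prop) / t_org * 100.0
```

For example, an encoder run shrinking from 200 s to 110 s yields an ETS of 45%.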

4.2. Performance of Proposed Method

The proposed method has two thresholds, $\{T_b, T_a\}$, the thresholds of block size and activity, respectively. Each threshold value can affect the block partitioning decision. Although a posterior probability $p(a|\varepsilon=1)$ of 100% was attained in Section 3.2, the fast decision is rarely applied in that case because the prior probability $p(\varepsilon=1)$ is small. Therefore, in this paper, $T_b$ was set to 256. In Section 3.2, the CDFA of the sequences was analyzed according to the QP, as shown in Figure 5. If $T_a$ is determined according to the CDFA level, the strength of the fast algorithm can be controlled. Figure 6 shows the change in the average activity of the sequences according to the QPs for a given set of CDFA values. The average activity at a specific CDFA changes little with QP, meaning that activity hardly depends on the QP. Therefore, the value of $T_a$ can be fixed using a CDFA value. As shown in Figure 9, the average activity of each QP according to the CDFA exhibits a linear characteristic and a similar appearance regardless of QP. Therefore, we modeled the activity threshold $T_a$ by defining a base value, $T_a^{base}$, and scaling it according to the CDFA as follows:
$$T_a = f(\mathrm{CDFA}) \times T_a^{base}, \tag{13}$$
where $f(\cdot)$ is a scale function of the CDFA, $T_a^{base}$ is the average activity when the CDFA equals 0.95, and $f(\cdot)$ is derived as the least-squares fit of the points shown in Figure 9. The performance of the proposed approach was evaluated and compared for various CDFA values. Moreover, the performance with and without the directionality condition (11), which increases the accuracy of split-mode decisions, was evaluated. In this paper, the performance was evaluated for CDFA values of {0.95, 0.90, 0.85, 0.80}.
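One way to realize the $T_a$ model of (13) is to fit a line to the (CDFA, average activity) points of Figure 9 and normalize it so that $f(0.95) = 1$, giving $T_a = f(\mathrm{CDFA}) \times T_a^{base}$. The linear form and the normalization point follow the text; the function name and the NumPy-based fitting are our assumptions:

```python
import numpy as np

def fit_cdfa_scale(cdfa_points, avg_activity, base_level=0.95):
    """Least-squares linear fit f(CDFA) = a*CDFA + b, rescaled so that
    f(base_level) = 1; T_a is then f(cdfa) * T_a_base."""
    a, b = np.polyfit(np.asarray(cdfa_points, dtype=float),
                      np.asarray(avg_activity, dtype=float), 1)
    base = a * base_level + b
    return lambda cdfa: (a * cdfa + b) / base
```

Lowering the CDFA argument then lowers $T_a$, weakening the fast decision, which matches the controllability described above.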
Table 3 and Table 4 show the overall performances of the proposed method with and without the directionality condition. As shown in Table 3, without directionality, the proposed method gives up to 69.77% ETS with 4.52% BD-rate loss. Table 4 shows the performance with directionality, and the results indicate up to 68.94% ETS with 4.40% BD-rate loss. Directionality is checked to increase the precision of the split mode decision (i.e., horizontal split and vertical split). Thus, compared to the case without directionality, the overall BD-rate loss decreases, but ETS also decreases.
As shown in Figure 10, the proposed method tends to achieve a higher ETS as the resolution of the sequence increases. Because the proposed method determines the block partition by calculating the activity per sample, the higher the resolution of the sequence, the lower the content complexity contained in a block of a given size, which means that the proposed method can be applied more often to high-resolution sequences. The proposed method can control the strength of the fast block partitioning decision through the CDFA. As shown in Table 3 and Table 4, the ETS decreases as the CDFA decreases, because the strength of the block partitioning decision weakens. Therefore, the proposed method can be applied by adjusting the CDFA according to the application or content.
Because the proposed method calculates the activity from the original signal, the more textured the sequence, the less often the proposed method is applied. For example, although BlowingBubbles and BasketballPass have the same resolution, BlowingBubbles has little background and is full of textures; therefore, the proposed method rarely skips block partitioning, in order to preserve the texture as much as possible. Conversely, BasketballPass has a monotonous background with few large textures apart from the moving people; therefore, block partitioning skipping is applied extensively to increase the encoding speed. Figure 11 shows the reconstruction results of complex and monotonous areas with the proposed method. As shown in Figure 11, the texture in the complex area is not significantly smoothed even when compressed at a high QP, whereas in the monotonous area, the important textures are kept and the rest is strongly smoothed. Additionally, in E-class sequences, such as broadcast- and conference-type content, the ETS is higher than 50% because the background is static and only the speaker moves.
Split mode skipping directly reduces encoding complexity because the subsequent processing (prediction, transform, entropy coding, etc.) performed in each split block is skipped. To analyze the relationship between encoding time and encoding complexity, we counted how many block checks are performed during encoding and compared this count with the ETS of the proposed method, as shown in Figure 12. The block check reduction rate of the proposed method tends to track the ETS in Table 3. For the sequences BasketballDrill, RitualDance, BasketballPass, and BlowingBubbles, the block check reduction rates are 52%, 77%, 45%, and 19%, respectively, while the corresponding ETS values are 45%, 68%, 47%, and 14%. Because the proposed method skips BT and TT, the overall reduction rate is determined by the degree of reduction in BT and TT checks. In the case of RitualDance, BT and TT checks are greatly reduced, so the encoding complexity decreases significantly; in the case of BlowingBubbles, BT and TT checks are only slightly reduced, so the encoding complexity decreases only slightly.
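The effect of split skipping on the block check count can be illustrated with a toy recursion; this is an assumption-laden model of the search (one QT, one BT, and one TT candidate per block, with simplified child sizes), not the VTM partition search:

```python
def count_checks(size, depth, skip):
    """Count evaluated blocks in a toy QT/BT/TT enumeration.

    skip(size) models an activity-based early termination: when it returns
    True, no child partitions of the current block are evaluated.
    """
    checks = 1  # the current block is always evaluated once
    if depth == 0 or skip(size):
        return checks
    # One QT candidate (4 half-size children), one BT (2 half-size children),
    # and one TT (3 children, modeled at quarter size for simplicity).
    for n_children, child_size in ((4, size // 2), (2, size // 2), (3, size // 4)):
        checks += n_children * count_checks(child_size, depth - 1, skip)
    return checks

no_skip = count_checks(64, 2, lambda s: False)
with_skip = count_checks(64, 2, lambda s: s <= 16)  # skip splitting small blocks
print(no_skip, with_skip)  # 91 64 -- fewer block checks with early termination
```

Even this crude model shows why the block check reduction rate moves together with the ETS: every pruned subtree removes the prediction, transform, and entropy coding work of all its descendant blocks.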

4.3. Performance Comparison with the State-of-the-Art Works

Table 5 summarizes a performance comparison of the proposed method with state-of-the-art (SOTA) works [17,20]. In [17], a context-based ternary tree decision method (CTTD) was proposed, in which only TT partitioning is rapidly determined using the upper partition information. In [20], a tunable partitioning decision model (TDM) was proposed: first, the CU distortion is obtained from the residual between the original and predicted signals; then, two decision models for BT and TT skip some split modes. To compare the proposed method with the SOTA works, the CDFA parameter of the proposed method was set to 0.90. The comparison was performed under the AI configuration for the A1, A2, B, C, and D class sequences.
The CTTD method gives 34% to 41% ETS with 0.96% to 2.88% BD-rate loss, averaging 37% ETS with 1.71% BD-rate loss. The TDM method gives 32% to 50% ETS with 0.35% to 4.43% BD-rate loss, averaging 36% ETS with 1.53% BD-rate loss. The proposed method gives 17% to 78% ETS with 0.25% to 4.21% BD-rate loss, averaging 43% ETS with 1.79% BD-rate loss.
Compared with the CTTD, the proposed method shows a higher ETS at almost the same BD-rate loss. The difference between the TDM and the proposed method is that the TDM uses the residual signal to obtain the CU distortion, whereas the proposed method uses the original signal to determine the complexity of the block. Because the residual signal is used in the TDM, its ETS and BD-rate loss are relatively stable, regardless of the complexity characteristics of the sequences. A fast prediction mode decision method could be used in addition to fast block partitioning for further encoding speed-up. However, the TDM is coupled with the prediction mode decision process, because the optimal residual signal is produced by the best prediction mode; a fast prediction mode decision may therefore hinder the best decision in its fast block partitioning. Meanwhile, because the proposed fast partitioning decision relies on the complexity of the original signal, a fast prediction mode decision method can be applied independently alongside the fast block partitioning method.
Consequently, the proposed method adjusts the strength of the fast algorithm, without QP dependence, through a single parameter, the CDFA. Therefore, the user can adjust only one parameter to enable fast encoding suitable for a given application. In addition, because the block partitioning decision is made only with the information of the current block, parallel processing is possible without depending on neighboring or higher-level information, and a more VVC-friendly implementation is possible by reusing the gradient calculation module in VVC.

5. Conclusions

Herein, we proposed an activity-based block partitioning decision method for fast VVC encoding. The proposed method uses only the information of the current block, without information about neighboring or upper blocks, and minimizes the QP dependency. For these reasons, the proposed algorithm is simple and parallelizable. In addition, by utilizing the gradient calculation used in VVC's ALF, a VVC-friendly fast algorithm was designed.
The experimental results show that the proposed method significantly reduced the encoding time with a reasonable coding efficiency loss in VTM 10.2. Compared with the SOTA methods, the proposed method achieved a greater reduction in encoding time with a comparable BD-rate loss, reaching up to 78% ETS on VTM 4.0 and 69% ETS on VTM 10.2. In future work, a better trade-off between complexity reduction and coding loss should be investigated through more accurate block partitioning decisions, since the proposed fast decision uses the limited information of only the current block. In addition, the encoding speed is expected to increase further by also determining prediction modes, as well as the block partitioning, from the block characteristics identified in the original signal.

Author Contributions

Conceptualization, Y.-U.Y. and J.-G.K.; methodology, Y.-U.Y. and J.-G.K.; software, Y.-U.Y.; validation, Y.-U.Y. and J.-G.K.; formal analysis, Y.-U.Y.; investigation, Y.-U.Y.; resources, Y.-U.Y.; data curation, Y.-U.Y. and J.-G.K.; writing—original draft preparation, Y.-U.Y.; writing—review and editing, J.-G.K.; visualization, Y.-U.Y.; supervision, J.-G.K.; project administration, J.-G.K.; funding acquisition, J.-G.K. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded in part by the Institute of Information and Communications Technology Planning and Evaluation (IITP) under grant number 2017-0-00486, and in part by the National Research Foundation of Korea (NRF) under grant number 2020R1F1A1068106.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Bossen, F.; Li, X.; Suehring, K. AHG Report: Test Model Software Development (AHG3). Joint Video Experts Team (JVET) of ITU-T ISO/IEC, Doc. JVET-T0003. In Proceedings of the 20th Meeting, Teleconference (Online), 7–16 October 2020.
2. Sullivan, G.J.; Ohm, J.-R.; Han, W.-J.; Wiegand, T. Overview of the High Efficiency Video Coding (HEVC) Standard. IEEE Trans. Circuits Syst. Video Technol. 2012, 22, 1649–1668.
3. Bross, B.; Chen, J.; Ohm, J.-R.; Sullivan, G.J.; Wang, Y.-K. Developments in International Video Coding Standardization after AVC, With an Overview of Versatile Video Coding (VVC). Proc. IEEE 2021, 109, 1463–1493.
4. Chen, J.; Ye, Y.; Kim, S.H. Algorithm Description for Versatile Video Coding and Test Model 10 (VTM 10). Joint Video Experts Team (JVET) of ITU-T ISO/IEC, Doc. JVET-S2002. In Proceedings of the 19th Meeting, Teleconference (Online), 22 June–1 July 2020.
5. Chen, Y.-W.; Chien, W.-J.; Chuang, H.-C.; Coban, M.; Dong, J.; Egilmez, H.E.; Hu, N.; Karczewicz, M.; Ramasubramonian, A.; Rusanovskyy, D.; et al. Partition Only Software of the Video Coding Technology Proposal by Qualcomm and Technicolor. Joint Video Experts Team (JVET) of ITU-T ISO/IEC, Doc. JVET-J0075. In Proceedings of the 10th Meeting, San Diego, CA, USA, 10–20 April 2018.
6. Shen, L.; Zhang, Z.; An, P. Fast CU size decision and mode decision algorithm for HEVC intra coding. IEEE Trans. Consum. Electron. 2013, 59, 207–213.
7. Gu, J.; Tang, M.; Wen, J.; Han, Y. Adaptive intra candidate selection with early depth decision for fast intra prediction in HEVC. IEEE Signal Process. Lett. 2018, 25, 159–163.
8. Tan, H.; Ko, C.; Rahardja, S. Fast Coding Quad-Tree Decisions Using Prediction Residuals Statistics for High Efficiency Video Coding (HEVC). IEEE Trans. Broadcast. 2016, 62, 128–133.
9. Min, B.; Cheung, R.C.C. A Fast CU Size Decision Algorithm for the HEVC Intra Encoder. IEEE Trans. Circuits Syst. Video Technol. 2015, 25, 892–896.
10. Lei, J.; Li, D.; Pan, Z.; Sun, Z.; Kwong, S. Fast Intra Prediction Based on Content Property Analysis for Low Complexity HEVC-Based Screen Content Coding. IEEE Trans. Broadcast. 2017, 63, 48–58.
11. Shen, X.; Yu, L. CU splitting early termination based on weighted SVM. EURASIP J. Image Video Process. 2013, 2013, 4–14.
12. Zhang, Y.; Kwong, S.; Wang, X.; Yuan, H.; Pan, Z.; Xu, L. Machine Learning-Based Coding Unit Depth Decisions for Flexible Complexity Allocation in High Efficiency Video Coding. IEEE Trans. Image Process. 2015, 24, 2225–2238.
13. Liu, X.; Li, Y.; Liu, D.; Wang, P.; Yang, L.T. An Adaptive CU Size Decision Algorithm for HEVC Intra Prediction Based on Complexity Classification Using Machine Learning. IEEE Trans. Circuits Syst. Video Technol. 2019, 29, 144–155.
14. Zhang, Y.; Pan, Z.; Li, N.; Wang, X.; Jiang, G.; Kwong, S. Effective Data Driven Coding Unit Size Decision Approaches for HEVC INTRA Coding. IEEE Trans. Circuits Syst. Video Technol. 2018, 28, 3208–3222.
15. Xu, M.; Li, T.; Wang, Z.; Deng, X.; Yang, R.; Guan, Z. Reducing Complexity of HEVC: A Deep Learning Approach. IEEE Trans. Image Process. 2018, 27, 5044–5059.
16. Liu, Z.; Yu, X.; Gao, Y.; Chen, S.; Ji, X.; Wang, D. CU Partition Mode Decision for HEVC Hardwired Intra Encoder Using Convolution Neural Network. IEEE Trans. Image Process. 2016, 25, 5088–5103.
17. Park, S.-H.; Kang, J.-W. Context-Based Ternary Tree Decision Method in Versatile Video Coding for Fast Intra Coding. IEEE Access 2019, 7, 172597–172605.
18. Fan, Y.; Chen, J.; Sun, H.; Katto, J.; Jing, M. A Fast QTMT Partition Decision Strategy for VVC Intra Prediction. IEEE Access 2020, 8, 107900–107911.
19. Wang, Z.; Wang, S.; Zhang, J.; Wang, S.; Ma, S. Probabilistic Decision Based Block Partitioning for Future Video Coding. IEEE Trans. Image Process. 2018, 27, 1475–1486.
20. Li, Y.; Yang, G.; Song, Y.; Zhang, H.; Ding, X.; Zhang, D. Early Intra CU Size Decision for Versatile Video Coding Based on a Tunable Decision Model. IEEE Trans. Broadcast. 2021, 67, 710–720.
21. Cui, J.; Zhang, T.; Gu, C.; Zhang, X.; Ma, S. Gradient-based Early Termination of CU Partition in VVC Intra Coding. In Proceedings of the 2020 Data Compression Conference (DCC), Snowbird, UT, USA, 24–27 March 2020; pp. 103–112.
22. Zhang, Q.; Wang, Y.; Huang, L.; Jiang, B. Fast CU Partition and Intra Mode Decision Method for H.266/VVC. IEEE Access 2020, 8, 117539–117550.
23. Kulupana, G.; Kumar, V.P.; Blasi, S. Fast Versatile Video Coding using Specialised Decision Trees. In Proceedings of the 2021 Picture Coding Symposium (PCS), Bristol, UK, 29 June–2 July 2021; pp. 1–5.
24. Teng, G.; Xiong, D.; Ma, R.; An, P. Decision tree accelerated CTU partition algorithm for intra prediction in versatile video coding. PLoS ONE 2021, 16, e0258890.
25. Yang, H.; Shen, L.; Dong, X.; Ding, Q.; An, P.; Jiang, G. Low-Complexity CTU Partition Structure Decision and Fast Intra Mode Decision for Versatile Video Coding. IEEE Trans. Circuits Syst. Video Technol. 2020, 30, 1668–1682.
26. Wang, Z.; Wang, S.; Zhang, J.; Wang, S.; Ma, S. Effective Quadtree Plus Binary Tree Block Partition Decision for Future Video Coding. In Proceedings of the 2017 Data Compression Conference (DCC), Snowbird, UT, USA, 4–7 April 2017; pp. 23–32.
27. Jin, Z.; An, P.; Yang, C.; Shen, L. Fast QTBT Partition Algorithm for Intra Frame Coding through Convolutional Neural Network. IEEE Access 2018, 6, 54660–54673.
28. Li, T.; Xu, M.; Tang, R.; Chen, Y.; Xing, Q. Deep QTMT: A Deep Learning Approach for Fast QTMT-Based CU Partition of Intra-Mode VVC. IEEE Trans. Image Process. 2021, 30, 5377–5390.
29. Yoon, Y.-U.; Park, D.; Kim, J.-G. A Fast Decision Method of Quadtree plus Binary Tree (QTBT) Depth in JEM. J. Broadcast. Eng. 2017, 22, 541–547.
30. Yoon, Y.-U.; Park, D.; Kim, J.-G. Gradient-Based Methods of Fast Intra Mode Decision and Block Partitioning. J. Broadcast. Eng. 2020, 25, 338–345.
31. Bossen, F.; Boyce, J.; Suehring, K.; Li, X.; Seregin, V. VTM Common Test Conditions and Software Reference Configurations for SDR Video. Joint Video Experts Team (JVET) of ITU-T ISO/IEC, Doc. JVET-T2010. In Proceedings of the 20th Meeting, Teleconference (Online), 7–16 October 2020.
32. VVC Test Model (VTM) Software. Available online: https://vcgit.hhi.fraunhofer.de/jvet/VVCSoftware_VTM/-/tags (accessed on 25 March 2022).
Figure 1. MTT of VVC: (a) MTT block partitioning modes; (b) Example of recursive block partitioning of MTT.
Figure 2. Sample-wise gradients and activity frames of the original image: (a) Original; (b) Vertical gradient, g_v; (c) Horizontal gradient, g_h; (d) Activity, A.
Figure 3. Original, sample-wise gradients, and activity with overlapping block partitions (blue boxes: QT; green boxes: BT-/TT-H; red boxes: BT-/TT-V): (a) Original; (b) Horizontal split with g_v; (c) Vertical split with g_h; (d) All splits with A.
Figure 4. The activity distribution per sample according to the block size: (a) BasketballDrill—832 × 480; (b) BasketballDrive—1920 × 1080; (c) BQTerrace—1920 × 1080; (d) RaceHorsesC—832 × 480.
Figure 5. Cumulative distribution function of activity for various sequences according to QP: (a) QP 22; (b) QP 27; (c) QP 32; (d) QP 37.
Figure 6. Changes in average activity according to QPs at a fixed value of CDFA.
Figure 7. Example of sub-activities.
Figure 8. Flowchart of the proposed fast partitioning decision method.
Figure 9. The average activity changes according to CDFA.
Figure 10. ETS performance according to sequence resolution (A-class: 3840 × 2160; B-class: 1920 × 1080; C-class: 832 × 480; D-class: 416 × 240).
Figure 11. Reconstructed results of complex and monotonous areas by the proposed method: (a) BlowingBubbles—416 × 240; (b) original of the complex area; (c) QP 27; (d) QP 37; (e) original of the monotonous area; (f) QP 27; (g) QP 37.
Figure 12. Comparisons of the block check reduction rate and ETS of the proposed method: (a) BasketballDrill; (b) RitualDance; (c) BasketballPass; (d) BlowingBubbles.
Table 1. Prior probability of block size (QP = 27).

| Sequence | p(ε = 1; T_b = 512) | p(ε = 1; T_b = 256) |
|---|---|---|
| BasketballDrill | 0.29% | 3.35% |
| RaceHorsesC | 2.15% | 8.08% |
| BasketballDrive | 13.56% | 35.70% |
| BQTerrace | 2.38% | 6.12% |
| Average | 4.59% | 13.31% |
Table 2. Posterior probability with activity given ε (QP = 27), with T_a = 15.

| Sequence | p(a ≤ T_a ∣ ε = 1; T_b = 512) | p(a ≤ T_a ∣ ε = 1; T_b = 256) |
|---|---|---|
| BasketballDrill | 100% | 99.51% |
| RaceHorsesC | 100% | 94.86% |
| BasketballDrive | 100% | 96.43% |
| BQTerrace | 100% | 99.93% |
| Average | 100% | 97.68% |
Table 3. Performance of the proposed method without directionality on VTM 10.2. Column pairs give BD-rate and ETS for CDFA = {0.95, 0.90, 0.85, 0.80}.

| Class | Sequence | BD-Rate (0.95) | ETS (0.95) | BD-Rate (0.90) | ETS (0.90) | BD-Rate (0.85) | ETS (0.85) | BD-Rate (0.80) | ETS (0.80) |
|---|---|---|---|---|---|---|---|---|---|
| A1 (3840 × 2160) | Tango2 | 3.75% | 64.70% | 3.67% | 63.81% | 3.38% | 60.90% | 1.48% | 20.43% |
| | FoodMarket4 | 3.33% | 64.56% | 3.26% | 63.03% | 3.08% | 60.09% | 2.64% | 47.40% |
| | Campfire | 1.98% | 31.54% | 1.54% | 24.02% | 1.11% | 17.02% | 0.62% | 2.85% |
| A2 (3840 × 2160) | CatRobot1 | 4.36% | 52.63% | 2.59% | 35.11% | 1.35% | 19.54% | 0.72% | 4.87% |
| | DaylightRoad2 | 3.66% | 53.45% | 0.86% | 16.60% | 0.13% | 5.16% | 0.04% | 4.74% |
| | ParkRunning3 | 2.04% | 55.08% | 1.91% | 50.12% | 1.73% | 45.34% | 1.42% | 32.85% |
| B (1920 × 1080) | MarketPlace | 2.55% | 60.23% | 2.24% | 52.32% | 1.74% | 41.85% | 0.80% | 21.99% |
| | RitualDance | 4.52% | 69.77% | 4.20% | 65.66% | 3.77% | 60.27% | 3.05% | 50.66% |
| | Cactus | 2.34% | 45.81% | 1.17% | 30.80% | 0.31% | 14.32% | 0.02% | 1.32% |
| | BasketballDrive | 2.94% | 53.20% | 1.84% | 41.46% | 0.69% | 21.39% | 0.07% | 2.72% |
| | BQTerrace | 0.88% | 28.15% | 0.43% | 15.52% | 0.22% | 8.52% | 0.12% | 3.19% |
| C (832 × 480) | BasketballDrill | 4.15% | 48.22% | 2.97% | 36.70% | 1.56% | 25.58% | 0.49% | 9.98% |
| | BQMall | 2.82% | 40.03% | 1.88% | 29.62% | 1.05% | 22.38% | 0.31% | 9.79% |
| | PartyScene | 0.44% | 19.12% | 0.23% | 10.77% | 0.12% | 8.49% | 0.04% | 3.50% |
| | RaceHorsesC | 1.61% | 38.07% | 1.28% | 31.38% | 0.96% | 28.49% | 0.59% | 19.30% |
| D (416 × 240) | BasketballPass | 3.02% | 50.28% | 2.29% | 40.79% | 1.58% | 34.70% | 0.82% | 20.12% |
| | BQSquare | 0.52% | 19.28% | 0.34% | 11.92% | 0.17% | 9.87% | 0.08% | 3.89% |
| | BlowingBubbles | 0.42% | 16.81% | 0.22% | 9.07% | 0.11% | 8.90% | 0.01% | 4.09% |
| | RaceHorses | 1.68% | 38.76% | 1.30% | 30.13% | 0.98% | 28.53% | 0.68% | 20.07% |
| E (1280 × 720) | FourPeople | 5.11% | 58.92% | 4.30% | 50.72% | 3.45% | 42.02% | 2.19% | 28.86% |
| | KristenAndSara | 4.90% | 56.14% | 3.88% | 45.50% | 2.98% | 35.82% | 1.86% | 25.83% |
| | Johnny | 4.00% | 54.67% | 3.41% | 46.81% | 2.79% | 39.91% | 2.00% | 29.06% |
| F | BasketballDrillText | 3.69% | 42.84% | 2.77% | 33.66% | 1.58% | 21.33% | 0.57% | 8.52% |
| | ChinaSpeed | 4.22% | 50.78% | 3.75% | 45.96% | 3.16% | 39.58% | 2.35% | 30.38% |
| | SlideEditing | 1.26% | 24.56% | 1.03% | 22.23% | 0.88% | 19.88% | 0.78% | 17.69% |
| | SlideShow | 2.69% | 36.21% | 2.39% | 32.82% | 2.12% | 29.18% | 1.70% | 25.61% |
| Average | | 2.80% | 45.15% | 2.14% | 36.02% | 1.58% | 28.81% | 0.98% | 17.30% |
Table 4. Performance of the proposed method with directionality on VTM 10.2. Column pairs give BD-rate and ETS for CDFA = {0.95, 0.90, 0.85, 0.80}.

| Class | Sequence | BD-Rate (0.95) | ETS (0.95) | BD-Rate (0.90) | ETS (0.90) | BD-Rate (0.85) | ETS (0.85) | BD-Rate (0.80) | ETS (0.80) |
|---|---|---|---|---|---|---|---|---|---|
| A1 (3840 × 2160) | Tango2 | 3.75% | 63.57% | 3.65% | 62.00% | 3.28% | 56.21% | 1.15% | 8.53% |
| | FoodMarket4 | 3.33% | 63.97% | 3.25% | 59.98% | 3.02% | 55.90% | 2.53% | 2.97% |
| | Campfire | 1.83% | 27.52% | 1.41% | 17.49% | 0.98% | 9.68% | 0.51% | 2.17% |
| A2 (3840 × 2160) | CatRobot1 | 4.12% | 51.09% | 2.20% | 24.47% | 1.18% | 12.14% | 0.61% | 1.66% |
| | DaylightRoad2 | 3.31% | 46.00% | 0.47% | 6.40% | 0.08% | 0.40% | 0.05% | 0.12% |
| | ParkRunning3 | 2.00% | 53.12% | 1.85% | 45.57% | 1.66% | 40.05% | 1.34% | 29.92% |
| B (1920 × 1080) | MarketPlace | 2.50% | 58.77% | 2.17% | 47.94% | 1.64% | 39.36% | 0.65% | 16.45% |
| | RitualDance | 4.40% | 68.94% | 4.07% | 61.90% | 3.62% | 57.91% | 2.85% | 46.98% |
| | Cactus | 2.03% | 41.47% | 0.93% | 22.76% | 0.18% | 9.83% | 0.00% | 0.58% |
| | BasketballDrive | 2.77% | 50.95% | 1.63% | 35.84% | 0.49% | 17.09% | 0.04% | 2.07% |
| | BQTerrace | 0.73% | 24.68% | 0.35% | 9.55% | 0.19% | 6.83% | 0.09% | 2.15% |
| C (832 × 480) | BasketballDrill | 3.73% | 45.11% | 2.48% | 29.78% | 1.21% | 12.42% | 0.30% | 6.84% |
| | BQMall | 2.56% | 37.28% | 1.62% | 23.86% | 0.86% | 8.66% | 0.22% | 4.88% |
| | PartyScene | 0.34% | 17.06% | 0.16% | 6.84% | 0.07% | 3.28% | 0.04% | 1.56% |
| | RaceHorsesC | 1.48% | 36.07% | 1.17% | 26.24% | 0.81% | 17.16% | 0.43% | 1.03% |
| D (416 × 240) | BasketballPass | 2.73% | 47.16% | 2.03% | 36.64% | 1.33% | 21.99% | 0.73% | 5.22% |
| | BQSquare | 0.40% | 16.59% | 0.21% | 5.96% | 0.13% | 1.59% | 0.05% | 0.85% |
| | BlowingBubbles | 0.27% | 14.47% | 0.12% | 4.21% | 0.04% | 2.52% | 0.01% | 1.23% |
| | RaceHorses | 1.48% | 35.74% | 1.13% | 26.04% | 0.82% | 16.07% | 0.53% | 7.38% |
| E (1280 × 720) | FourPeople | 4.87% | 55.12% | 4.04% | 46.60% | 3.12% | 36.49% | 1.95% | 22.81% |
| | KristenAndSara | 4.65% | 51.88% | 3.63% | 42.82% | 2.67% | 32.17% | 1.56% | 19.62% |
| | Johnny | 3.82% | 51.45% | 3.22% | 43.21% | 2.62% | 35.18% | 1.79% | 23.66% |
| F | BasketballDrillText | 3.34% | 38.98% | 2.32% | 27.81% | 1.17% | 15.16% | 0.38% | 4.27% |
| | ChinaSpeed | 3.98% | 48.55% | 3.49% | 42.76% | 2.91% | 35.45% | 2.07% | 25.74% |
| | SlideEditing | 0.78% | 16.89% | 0.66% | 14.37% | 0.53% | 13.04% | 0.44% | 12.08% |
| | SlideShow | 2.50% | 30.41% | 2.18% | 26.08% | 1.83% | 23.03% | 1.46% | 20.70% |
| Average | | 2.60% | 42.03% | 1.94% | 30.66% | 1.40% | 22.29% | 0.84% | 10.44% |
Table 5. The comparison of state-of-the-art works and the proposed method on VTM 4.0.

| Class | Sequence | CTTD [17] BD-Rate | CTTD [17] ETS | TDM [20] BD-Rate | TDM [20] ETS | Proposed BD-Rate | Proposed ETS |
|---|---|---|---|---|---|---|---|
| A1 | Tango2 | 1.55% | 34% | 1.12% | 35% | 3.70% | 78% |
| | FoodMarket4 | 1.39% | 37% | 1.00% | 32% | 3.14% | 78% |
| | Campfire | 1.25% | 36% | 4.43% | 50% | 1.28% | 29% |
| A2 | CatRobot1 | 2.20% | 37% | 1.55% | 35% | 2.58% | 41% |
| | DaylightRoad2 | 1.54% | 37% | 1.47% | 37% | 0.48% | 19% |
| | ParkRunning3 | 1.22% | 38% | 0.35% | 37% | 1.72% | 59% |
| B | MarketPlace | 0.96% | 38% | 1.23% | 35% | 2.15% | 59% |
| | RitualDance | 1.67% | 36% | 1.38% | 37% | 4.21% | 75% |
| | Cactus | 1.85% | 38% | 1.36% | 36% | 1.21% | 36% |
| | BasketballDrive | 1.86% | 36% | 1.64% | 37% | 2.07% | 49% |
| | BQTerrace | 1.36% | 41% | 1.30% | 36% | 0.41% | 19% |
| C | BasketballDrill | 2.88% | 40% | 2.78% | 37% | 3.07% | 47% |
| | BQMall | 2.29% | 38% | 1.62% | 37% | 2.00% | 40% |
| | PartyScene | 1.48% | 40% | 1.27% | 40% | 0.29% | 19% |
| | RaceHorsesC | 1.81% | 39% | 1.24% | 36% | 1.37% | 42% |
| D | BasketballPass | 2.03% | 36% | 1.69% | 37% | 2.43% | 53% |
| | BQSquare | 1.33% | 38% | 1.00% | 37% | 0.35% | 19% |
| | BlowingBubbles | 1.67% | 40% | 1.28% | 38% | 0.25% | 17% |
| | RaceHorses | 2.08% | 40% | 1.40% | 35% | 1.40% | 40% |
| Average | | 1.71% | 37% | 1.53% | 36% | 1.79% | 43% |