Performance Comparison of VVC, AV1, HEVC, and AVC for High Resolutions

Uhrina, Miroslav; Sevcik, Lukas; Bienik, Juraj; Smatanova, Lenka

doi:10.3390/electronics13050953

Faculty of Electrical Engineering and Information Technology, University of Zilina, Univerzitna 1, 010 26 Zilina, Slovakia

Electronics 2024, 13(5), 953; https://doi.org/10.3390/electronics13050953

Submission received: 9 January 2024 / Revised: 15 February 2024 / Accepted: 28 February 2024 / Published: 1 March 2024

(This article belongs to the Section Computer Science & Engineering)

Abstract

Over the years, there has been growing interest in multimedia services, especially in the video domain, where firms and subscribers require higher resolutions, framerates, and sampling precision. This results in a huge amount of data that needs to be processed, stored, and transmitted. As a result, researchers face the challenge of developing new compression standards that can reduce the amount of data while maintaining the same quality. In this paper, the compression performance of the latest and most commonly used video codecs, namely H.266/VVC, AV1, H.265/HEVC, and H.264/AVC, was examined. The test set included seven sequences of various content at 8K, Ultra HD (UHD), and Full HD (FHD) resolutions, encoded to bitrates ranging from 1 to 15 Mbps for FHD and UHD resolutions and from 5 to 50 Mbps for 8K resolution. Objective quality metrics, such as peak signal-to-noise ratio (PSNR), the structural similarity index (SSIM), and video multi-method assessment fusion (VMAF), were used to measure codec performance. The results showed that H.266/VVC outperformed all other codecs, namely H.264/AVC, H.265/HEVC, and AV1, in terms of the Bjøntegaard delta (BD) model. The average bitrate savings were approximately 78% for H.266/VVC, 63% for AV1, and 53% for H.265/HEVC relative to H.264/AVC, 59% for H.266/VVC and 22% for AV1 relative to H.265/HEVC, and 46% for H.266/VVC relative to AV1 (all for 8K resolution). The results also showed that codec performance varied depending on resolution, with higher resolutions showing greater efficiency for newly developed codecs, such as H.266/VVC and AV1. This confirms that the H.266/VVC and AV1 codecs were primarily developed for videos at high resolutions, such as 8K and/or UHD.

Keywords: H.264/AVC; H.265/HEVC; H.266/VVC; AV1; QoE; objective assessment; PSNR; SSIM; VMAF; FHD; UHD; 8K

1. Introduction

In recent years, there has been a significant increase in demand for multimedia services, especially in the video field. Both subscribers and firms require higher resolutions, frame rates, and sampling precision, which is becoming a common part of video broadcasting and streaming. The research community is currently focusing more on 8K resolution. Higher frame rates are also in demand, particularly from post-production companies. Additionally, high dynamic range (HDR) technology has emerged as a common feature that can significantly improve image and video quality. However, these parameters have a significant impact on the final bitrate and bandwidth, as processing, storing, and transmitting such a vast amount of data becomes a major challenge for industry, researchers, and companies. Therefore, there is a need to develop new compression techniques and standards that can reduce the amount of data while maintaining the perceived quality.

Versatile video coding (VVC), also known as H.266 or MPEG-I Part 3, is the newest video codec from the MPEG family group and is a successor to high-efficiency video coding (HEVC). It was developed in 2020 by the Joint Video Experts Team (JVET), a joint video expert team of the VCEG working group of ITU-T Study Group 16, and the MPEG working group of ISO [1].

The biggest advantage of the VVC codec, as its name implies, is its versatility, which means the efficient encoding of a wide range of video content and applications. Although it is not widely used at present, it holds great promise for the future [2].

AOMedia Video 1 (AV1) is an open, royalty-free video codec that was released in 2018 by the Alliance for Open Media (AOMedia) to succeed Google’s VP9 codec. AOMedia is a large consortium that includes many companies, providers, video content producers, software development firms, and web browser vendors [3,4].

High-efficiency video coding (HEVC), also known as H.265 or MPEG-H Part 2, is a video compression standard that was developed in 2013 to succeed the H.264/AVC codec. HEVC is a video project of the ITU-T Video Coding Experts Group (VCEG) and the ISO/IEC Moving Picture Experts Group (MPEG) standardization organizations, working together in a partnership known as the Joint Collaborative Team on Video Coding (JCT-VC) [4,5].

Advanced video coding (AVC), also known as H.264 or MPEG-4 Part 10, is a video compression standard from the same standardization organizations as the HEVC codec. Although it was developed in 2003, it remains one of the most popular codecs for the recording, compression, and distribution of video content [6].

Recently, many experts and researchers have provided quality performance analyses of the aforementioned codecs. The authors of [7,8] performed an analysis between the HEVC and VVC codecs for test sequences, with the resolution ranging from 240p up to Ultra HD (UHD), and in [9], from 480p to UHD resolution, respectively. In these articles, the compression efficiency was evaluated by using the peak signal-to-noise ratio (PSNR) objective method. In [10], the rate-distortion analysis of the same codecs using the PSNR, structural similarity index (SSIM), and video multi-method assessment fusion (VMAF) quality metrics was provided. The authors of [11,12] assessed the video quality of the HEVC, VVC, and AV1 compression standards for test sequences with resolutions that varied from 240p to UHD/4K, and in [13,14], at Full HD (FHD) and Ultra HD (UHD) resolutions, respectively. The compression efficiency was calculated using the PSNR objective metric. In [13,14], the multi-scale structural similarity index (MS-SSIM) method was also used for the quality evaluation. The authors of [15] provided a comparative experimental study of HEVC, VVC, and AV1, as well as the H.264 codec, for Full HD (FHD) test sequences. A rate-distortion analysis was provided by the PSNR, SSIM, and VMAF objective metrics. The authors of [16] offer a comparative evaluation of the compression efficiency of H.264, HEVC, VVC, AV1, and VP9, but only for video sequences at 480p resolution. In [17], a comparative performance assessment of five video codecs (HEVC, VVC, AV1, EVC, and VP9) is presented. The experimental evaluation was carried out on three video datasets with three different resolutions: 768 × 432, 560 × 488, and 3840 × 2160 (UHD). In [18], the authors compared the performances of the HEVC, VVC, AV1, and VP9 codecs using video sequences from three different datasets with resolutions ranging from 480p up to Full HD (FHD). The rate-distortion analysis was performed by using both the PSNR and VMAF objective quality metrics. The authors of [19] provide an objective performance evaluation of the HEVC, JEM, AV1, and VP9 codecs, which was carried out using the PSNR metric. A large test set of 28 video sequences with different resolutions varying from 240p to Ultra HD (UHD) was generated. In [20], the coding performance of the HEVC, VVC, VP9, and AVS3 codecs is described. The datasets cover a wide range of video sequences at resolutions up to 4K. The authors of [21] examined the compression performance of three codecs, namely HEVC, VVC, and AV1, which was measured by using the PSNR and SSIM objective video quality metrics. In [22], the authors compared the coding performance of HEVC, EVC, VVC, and AV1 in terms of computational complexity.

Although the two articles mentioned above investigated the quality performance of the latest codecs using 8K video test sequences, the number of sequences used is still low. From the survey, it is apparent that a comprehensive performance evaluation of popular codecs at 8K resolution is missing. Therefore, we decided to conduct an objective assessment of well-known codecs at various resolutions and bitrates.

This paper aims to evaluate the performance of four commonly used video codecs—H.264/AVC, H.265/HEVC, H.266/VVC, and AV1. The assessment was conducted on seven different test sequences that have diverse spatial information (SI) and temporal information (TI) values. The codecs were tested at different bitrates for three different resolutions—8K, Ultra HD (UHD), and Full HD (FHD). The quality was evaluated using the PSNR, SSIM, and VMAF objective metrics.

The remainder of the paper is organized as follows. Section 2 focuses on the experiment setup, where the used dataset, video encoding, and objective quality evaluation are described. Section 3 deals with the analysis of the results, and Section 4 provides the conclusion.

2. Experiment Setup

2.1. Dataset Description

Many factors can influence the results. Firstly, the selection of the test set, that is, the test sequences, plays an important role. The more complex the test sequence, in terms of the SI-TI parameters, the more difficult the encoding process. Conversely, sequences with slow motion or little spatial detail can be coded with higher efficiency. The resolution, bit depth, framerate (i.e., the number of frames per second), and color space are input factors that can also affect the final results. Last but not least, the experimental setup, in terms of the adjustment of the encoding parameters, can influence the results. This includes, for instance, the group of pictures (GoPs) setting, the encoding quality choice (either a quantization parameter (QP) or a bitrate (BR) constraint), the selection of encoding modes (constant bitrate (CBR), variable bitrate (VBR), or average bitrate (ABR)), the choice of rate control modes (1-pass, 2-pass, or CRF encoding), and the selection of presets, tunes, or profiles.

In our experiments, we used sequences from three different databases. As far as we know, these three datasets are the only ones that contain test sequences at 8K resolution. The first dataset, called “Fraunhofer”, was created in 2019 by the Fraunhofer HHI and is publicly available at [23]. It is a set of seven 8K video sequences in BT.2020 SDR and BT.2100 PQ HDR versions. The sequences were recorded using the RED DSMC2 camera with a Helium 8K S35 35.4 Megapixel CMOS Sensor. The second dataset, called “SEPE”, contains 40 video sequences that were captured using a Canon R5C video camera in 10-bit Canon Cinema RAW Light (CRM) format, converted into the 8-bit RGBA pixel format, and encoded in the lossless PNG format using Adobe Premiere Pro. The whole dataset is available at [24]. The third dataset, called “PP8K”, contains sixteen 8K video sequences that were collected using a professional 8K camera (Sharp 8C-B60A camcorder), which provides 16-bit raw data. Afterward, for practical use, they were converted into a 10-bit 4:2:0 YUV format by FFmpeg, using the following command line:

ffmpeg -f image2 -i 0000%d.tif -s 7680x4320 -pix_fmt yuv420p10le SeqName.yuv

The YUV, also known as YPbPr in analog or YCbCr in digital video, is a color model composed of luma (Y) and two chroma (UV) components. In the past, it was primarily used on an analog television. Nowadays, it is used for chroma subsampling, which is a type of compression that reduces the color information in a video signal in favor of luminance (Y) data in order to reduce bandwidth usage without significantly affecting picture quality. There are many types of subsampling modes, for instance, 4:2:2, 4:2:0, or 4:1:1. In our experiments, we have used the 4:2:0 subsampling format, which means the bandwidth of a video signal is reduced by half compared to no chroma subsampling signal.
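The halving mentioned above can be checked with a short calculation. The following Python sketch counts samples in the J:a:b notation (a reference block J pixels wide and two rows high) and, as a side calculation, estimates the size of one raw 8K frame in the `yuv420p10le` pixel format, which stores each 10-bit sample in two bytes. The function names are hypothetical and are used for illustration only.

```python
def samples_per_pixel(j: int, a: int, b: int) -> float:
    """Average number of samples per pixel for J:a:b chroma subsampling.

    The reference block is j pixels wide and 2 rows high.
    a = chroma samples per chroma plane in the first row,
    b = chroma samples per chroma plane in the second row.
    """
    luma = 2 * j            # one luma sample for every pixel of the block
    chroma = 2 * (a + b)    # two chroma planes (Cb and Cr)
    return (luma + chroma) / (2 * j)


def raw_frame_bytes(width: int, height: int, spp: float, bytes_per_sample: int) -> int:
    """Size of one uncompressed frame in bytes."""
    return int(width * height * spp * bytes_per_sample)


full = samples_per_pixel(4, 4, 4)   # no subsampling: 3.0 samples per pixel
sub = samples_per_pixel(4, 2, 0)    # 4:2:0: 1.5 samples per pixel
ratio = sub / full                  # 0.5, i.e., half the data

# One 7680x4320 frame in yuv420p10le (2 bytes per sample)
frame_8k = raw_frame_bytes(7680, 4320, sub, 2)
```

With these definitions, 4:2:2 yields 2.0 samples per pixel and 4:1:1 yields 1.5, the same total as 4:2:0 but with a different spatial distribution of the chroma samples.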

The color space of these sequences is ITU-R BT.2020 and is available in [25]. All the technical parameters, such as resolution, color space, bit depth, and framerate, as well as the duration of all three datasets, are presented in Table 1.

The datasets have great diversity and universality. In order to evaluate the performance of all sequences, the spatial-temporal analysis, according to [26], should be performed. The spatial perceptual information (SI) indicates the amount of spatial detail in a video frame and is based on the Sobel filter. It is generally higher for more spatial scene content. The temporal perceptual information (TI) indicates the amount of temporal changes in a video sequence. This change is called the motion difference property and is defined as a function of time. It is generally higher for high-motion video sequences. Both SI and TI are calculated for the luminance part of the video frames only [26].
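As an illustration of the two definitions above, the following Python sketch computes SI and TI for a list of luminance frames. It assumes the common formulation in which SI is the maximum over time of the spatial standard deviation of the Sobel-filtered frame and TI is the maximum over time of the spatial standard deviation of the difference between consecutive frames. It is not the tool used to produce the diagrams in this paper; the function names are hypothetical, and border pixels are simply skipped.

```python
import numpy as np


def sobel_magnitude(frame):
    """Gradient magnitude of a 2-D luminance frame (interior pixels only)."""
    f = np.asarray(frame, dtype=np.float64)
    # Horizontal gradient: right column minus left column, weighted 1-2-1
    gx = (f[:-2, 2:] + 2 * f[1:-1, 2:] + f[2:, 2:]) \
        - (f[:-2, :-2] + 2 * f[1:-1, :-2] + f[2:, :-2])
    # Vertical gradient: bottom row minus top row, weighted 1-2-1
    gy = (f[2:, :-2] + 2 * f[2:, 1:-1] + f[2:, 2:]) \
        - (f[:-2, :-2] + 2 * f[:-2, 1:-1] + f[:-2, 2:])
    return np.hypot(gx, gy)


def si_ti(frames):
    """Return (SI, TI) for a sequence of 2-D luminance frames."""
    si = max(float(sobel_magnitude(f).std()) for f in frames)
    diffs = [
        np.asarray(cur, dtype=np.float64) - np.asarray(prev, dtype=np.float64)
        for prev, cur in zip(frames[:-1], frames[1:])
    ]
    ti = max((float(d.std()) for d in diffs), default=0.0)
    return si, ti


# Synthetic example: a vertical edge that shifts by one pixel per frame
edge = np.zeros((16, 16), dtype=np.uint8)
edge[:, 8:] = 255
moving = [np.roll(edge, k, axis=1) for k in range(3)]
si_value, ti_value = si_ti(moving)
```

A static sequence gives TI = 0 regardless of its spatial detail, while a flat gray sequence gives SI = 0 and TI = 0.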

Since the spatial and temporal information varies with resolution, one SI-TI diagram should be drawn for every resolution. In our research, we decided to explore the video quality not only at 8K resolution but also at Ultra HD (UHD) and Full HD (FHD) resolution. For this reason, we had to downscale all test sequences from 8K resolution to UHD and FHD resolutions, respectively. For this purpose, we used the FFmpeg tool [27] and the following command line:

ffmpeg -f rawvideo -video_size {resolution} -pixel_format {color space} -framerate {framerate} -i {input_sequence} -vf scale={resolution} -c:v rawvideo {output_sequence}

All three SI-TI diagrams, one for each resolution, computed using [28], are shown in Figure 1. All the sequences of all three datasets are depicted in these plots. The datasets are distinguished by different markers: squares represent the Fraunhofer dataset, circles represent the SEPE dataset, and asterisks represent the PP8K dataset; each sequence is marked by a different color. The serial number of each sequence is preceded by a letter abbreviating the dataset, namely “F” for the “Fraunhofer”, “S” for the “SEPE”, and “P” for the “PP8K” dataset.

The entire dataset contains 63 test sequences altogether for each resolution. For our study, we had to select some of them. In order to cover all SI-TI diagrams, we decided to choose those sequences with a wide variety of SI and TI values. We picked one sequence from each of the four corners, namely “Cooking”, “TiergartenParkway”, “BodeMuseum”, and “Koi”, and three sequences from the middle of the SI-TI plot, namely “NeptuneFountain3”, “Giraffe”, and “36”. A short description of the characteristics (with given SI-TI values) of the selected test sequences is shown in Table 2, and their previews are given in Figure 2.

2.2. Video Encoding

For our study, all the test sequences were encoded to the tested compression standards, namely H.264/AVC, H.265/HEVC, H.266/VVC, and AV1, using the FFmpeg tool [27]. Since FFmpeg does not support VVC natively, a patch that adds VVC support was manually applied to FFmpeg, according to [29]. The bitrates were set to 1, 3, 5, 7, 10, and 15 Mbps for FHD and UHD resolutions and to 5, 7, 10, 15, 30, and 50 Mbps for 8K resolution. Altogether, 420 test sequences were encoded. The group of pictures (GoPs) was set to 60 for the Fraunhofer and PP8K datasets and 30 for the SEPE dataset, which ensures that an intra (I) frame occurs every second. The structure of the GoPs is described by two numbers, N and M, where N stands for the distance between two keyframes, i.e., I frames, also known as the length of the GoPs, and M stands for the distance between two anchor frames (I or P). Since the GoPs settings, except the I frame distance, were left at the defaults of each codec, the GoPs structure was as follows: N = 60 and M = 4 for H.264/AVC, N = 60 and M = 5 for H.265/HEVC, and N = 1 and M = 1 for AV1. For H.266/VVC, the intraperiod was set to 64, perceptual optimization was enabled, and the decoding refresh type was CRA, which stands for clean random access. For all codecs, no specific profiles or presets were set within the encoding process. All the encoding parameters are displayed in Table 3. We decided to use average bitrate coding (ABR), which means that only the target bitrate parameter was set. Since the H.264/AVC and H.265/HEVC codecs were not designed to encode videos at 8K resolution, two-pass encoding was used to ensure that the target bitrate was as accurate as possible for these two standards. The command lines that show the encoding settings for particular codecs are listed in Table 4. A flowchart highlighting the encoding and evaluation process is depicted in Figure 3.

2.3. Objective Quality Evaluation

A quality performance evaluation was conducted using the following objective metrics: peak signal-to-noise ratio (PSNR), structural similarity index (SSIM), and video multi-method assessment fusion (VMAF). Although PSNR is one of the oldest metrics and its results do not correlate well with subjective methods, it is still used. One reason is that this metric is very easy and fast to calculate, and the second is that it is used for the computation of the BD rate, which stands for the Bjøntegaard delta rate. The output value of this metric is given in decibels [dB]. Based on the measurements, the maximum value that PSNR can reach is 100 dB, which essentially corresponds to the quality of the reference video. The SSIM metric considers video degradation as a perceived change in structural information; it measures the distortions in the image structure resulting from changes in brightness, contrast, and the blurring of the image. This measurement is based on the assumption that the human visual system detects structural changes in a frame better than it identifies absolute errors, which leads to a higher correlation with subjective quality evaluations. The results obtained using the SSIM metric fall within the interval [0, 1], where 1 represents the best quality, which can be achieved only if the compared images or videos are identical [30]. VMAF is a video quality metric developed by Netflix in collaboration with the University of Southern California, the IPI/LS2N lab at Nantes Université, and the Laboratory for Image and Video Engineering (LIVE) at The University of Texas at Austin. It uses existing video quality metrics and other properties to predict video quality: visual information fidelity (VIF), detail loss metric (DLM), and mean co-located pixel difference (MCPD). All features are fused using SVM-based regression to determine an output score ranging from 0 to 100 per video frame, where 100 represents identical quality compared to a reference video [31]. All three of the above-mentioned metrics are so-called full-reference objective methods, which means that both the reference and the tested image/sequence must be available for evaluation, and the quality is computed as a direct comparison between both images or sequences. The objective assessment was conducted using the FFMetrics tool, which can be used to calculate various visual quality metrics. FFMetrics is an FFmpeg GUI designed to visualize quality metrics calculated by FFmpeg. The tool is free to use and can be downloaded from [32].
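To make the full-reference principle concrete, the following Python sketch computes the PSNR of a single frame from its textbook definition, 10·log10(MAX²/MSE). It is a minimal illustration and not the FFMetrics implementation; the function name and the `bit_depth` parameter are hypothetical. Mathematically, PSNR is unbounded for identical frames, so measurement tools typically report a finite ceiling instead, which is presumably the origin of a maximum such as 100 dB.

```python
import numpy as np


def psnr(reference, distorted, bit_depth=8):
    """PSNR in dB between a reference frame and a distorted frame."""
    ref = np.asarray(reference, dtype=np.float64)
    dis = np.asarray(distorted, dtype=np.float64)
    mse = np.mean((ref - dis) ** 2)      # mean squared error over all pixels
    if mse == 0:
        return float("inf")              # identical frames: unbounded PSNR
    peak = 2 ** bit_depth - 1            # 255 for 8-bit, 1023 for 10-bit video
    return float(10.0 * np.log10(peak ** 2 / mse))


# Example: every pixel of an 8-bit frame is off by exactly one level
ref_frame = np.zeros((4, 4), dtype=np.uint8)
dis_frame = np.ones((4, 4), dtype=np.uint8)
value = psnr(ref_frame, dis_frame)       # equals 20*log10(255), about 48.13 dB
```

For a whole sequence, the per-frame values are usually averaged; the same comparison of a reference and a tested frame underlies SSIM and VMAF, only with different distortion models.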

3. Analysis of the Results

The results are presented in two parts. In the first one, we selected PSNR to compute the bitrate savings in terms of the Bjøntegaard delta (BD-rate) model [33], where the BD-rate (in percent), as well as the BD-PSNR (in dB), are calculated from the area between two rate-distortion (RD) curves. The results are shown in Tables 5–10, where each of these tables represents values at a specific resolution, i.e., 8K, Ultra HD, and Full HD, respectively. In each table, the bitrate savings for a particular test sequence are calculated and listed. In addition, Table 11 and Table 12 present the averaged bitrate savings depending on a specific codec. In the second part, we generated rate-distortion plots using the aforementioned SSIM and VMAF quality metrics, as seen in Figures 4–9. Each of these figures contains seven plots, one for each test sequence, which compare the four codecs analyzed in our study. Each codec is highlighted in a different color, where red represents H.264/AVC, green represents H.265/HEVC, blue represents H.266/VVC, and cyan represents AV1. Figures 4–6 show the SSIM rate-distortion plots for the 8K, UHD, and FHD resolutions, whereas Figures 7–9 depict the VMAF rate-distortion plots for the same resolutions.
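The BD-rate calculation can be sketched as follows. The Python example below follows one common implementation of the Bjøntegaard model: the logarithm of the bitrate is fitted as a cubic polynomial of PSNR for each codec, both fits are integrated over the overlapping PSNR range, and the average difference is converted back to a percentage. It is an illustrative sketch and not necessarily the exact procedure used to produce the tables in this paper; the function name and the sample values are hypothetical.

```python
import numpy as np


def bd_rate(rate_ref, psnr_ref, rate_test, psnr_test):
    """Average bitrate difference (%) of the test codec vs. the reference codec.

    A negative result means that the test codec needs less bitrate
    than the reference codec for the same PSNR.
    """
    log_ref = np.log(np.asarray(rate_ref, dtype=np.float64))
    log_test = np.log(np.asarray(rate_test, dtype=np.float64))

    # Fit log(bitrate) as a third-order polynomial of quality
    poly_ref = np.polyfit(psnr_ref, log_ref, 3)
    poly_test = np.polyfit(psnr_test, log_test, 3)

    # Integrate both fits over the common PSNR interval
    lo = max(min(psnr_ref), min(psnr_test))
    hi = min(max(psnr_ref), max(psnr_test))
    int_ref = np.polyval(np.polyint(poly_ref), hi) - np.polyval(np.polyint(poly_ref), lo)
    int_test = np.polyval(np.polyint(poly_test), hi) - np.polyval(np.polyint(poly_test), lo)

    # Mean log-rate difference converted to a percentage
    avg_diff = (int_test - int_ref) / (hi - lo)
    return float((np.exp(avg_diff) - 1.0) * 100.0)


# Hypothetical RD points: the test codec reaches the same PSNR at half the bitrate
rates_a = [1, 3, 5, 7, 10, 15]                # Mbps
quality = [30.0, 33.0, 35.0, 36.0, 37.5, 39.0]  # dB
rates_b = [r / 2 for r in rates_a]
saving = bd_rate(rates_a, quality, rates_b, quality)  # about -50 (%)
```

With this sign convention, a result of about −50 corresponds to a bitrate saving of roughly 50% at equal PSNR.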

The results show that H.266/VVC outperforms all other codecs, namely H.264/AVC, H.265/HEVC, and AV1. The biggest difference in terms of bitrate savings is between VVC and H.264, ranging from around 59% at FHD to about 93% at 8K resolution. VVC also outperforms HEVC, with savings ranging from around 32% at UHD to about 78% at 8K resolution. AV1 appears to be the second most effective codec, as it outperforms H.264, with savings ranging from around 33% at FHD to about 81% at 8K resolution, as well as HEVC, with savings varying from 3% at FHD to around 44% at 8K resolution. The bitrate savings between the two currently most used compression standards, namely HEVC and H.264, range from around 14% at FHD to about 77% at 8K resolution. The difference in bitrate savings between the newly developed codecs, namely VVC and AV1, ranges from around 1% at UHD to about 70% at 8K resolution, which confirms that the H.266/VVC compression standard has the best coding efficiency and is promising for future use in storing and transmitting video content at 8K resolution. All the provided results vary depending on the test sequence used. Given the results, we can state that the bitrate savings between particular standards are greatest for sequences with low SI values, such as “Koi” or “Cooking” at 8K resolution, with approximately 69% and 67%, respectively; in contrast, the smallest differences occur for sequences located in the middle of the SI-TI diagram, such as “NeptuneFountain3” or “Giraffe” at 8K resolution, with about 44% and 48%, respectively. It is also evident that the bitrate savings between the newly developed compression standards (H.266/VVC or AV1) and the rest (H.265/HEVC or H.264/AVC) vary by resolution; they are largest at 8K resolution and gradually decrease towards FHD resolution. This trend is also confirmed by Figure 10, where each bar represents a value from Table 11. The resolution is highlighted by color, where yellow represents 8K resolution, red represents Ultra HD resolution, and blue represents Full HD resolution. This confirms that the VVC and AV1 codecs have been primarily developed for videos at high resolutions, such as 8K and/or UHD.

The reasons why the newly developed codecs—AV1 and H.266/VVC—achieve better video quality at high resolutions are given in the following lines.

Since the H.266/VVC codec was developed to be versatile, it comes with several new functionalities. One of them is random access capability, which refers to the ability to start consuming video content from positions other than the very beginning of the bitstream. VVC also allows the spatial resolution to change at inter-coded pictures through a feature referred to as reference picture resampling (RPR). In VVC, the coding tree units (CTUs), which are the basic processing units within a frame, can be larger than in HEVC, but the concept is the same and is similar to that of a macroblock in AVC. While the basic, well-known block-based hybrid video coding scheme used in all previous MPEG standards has been retained in VVC, the core compression technologies have been improved in several ways. First, the quadtree partitioning of a CTU has been extended to enable more flexible partitioning and to support larger block sizes. In intra-picture prediction, VVC provides finer-granularity angular prediction with 93 angles in addition to the DC and planar modes, which are similar to those in HEVC, as well as matrix-based prediction modes for luma and cross-component prediction modes for chroma samples. In interframe prediction, VVC uses either uni-prediction, with a single motion vector (MV) referencing a frame in a list of previously decoded reference frames, or bi-prediction, using two MVs. In addition to this, VVC offers a variety of new coding tools for the more efficient representation, prediction, and coding of motion compensation control information, as well as for enhancing the motion compensation processing itself. These techniques can be categorized into advances in coding motion information, advances in CU-level motion compensation, improved motion compensation processes using subblock-based motion derivation and prediction refinement at the decoder, and horizontal wrap-around motion compensation. 
In transform and quantization, VVC uses the same concept as HEVC, but VVC achieves better energy compaction of the prediction residual by extended transforms complemented by improved quantization and residual coding. The entropy coding in VVC is achieved by using CABAC coding, as in HEVC, but the efficiency is refined by some changes in the coefficient coding and probability estimation. In VVC, refined and new in-loop filters are used to reach better visual quality. Last but not least, VVC contains special coding tools that increase coding efficiency [2].

AV1 supports larger superblocks of up to 128 × 128 luma samples. A superblock can be partitioned into smaller blocks, where the minimum block size is extended to 4 × 4 luma samples, which provides more coding flexibility. AV1 also enables a two-stage block partitioning search, where the first pass starts from the largest block size. In intraframe prediction, AV1 extends the directional intra prediction options to support higher granularity and adds new smooth prediction modes to the nondirectional intra predictors. AV1 also allows intraframe motion-compensated prediction, which uses the previously coded pixels within the same frame, namely intrablock copy (IntraBC). Moreover, other features increase coding efficiency, such as chroma from luma prediction and the color palette mode. In interframe prediction, AV1 supports many toolsets to exploit the temporal correlation in video signals, which include adaptive filtering in translational motion compensation, affine motion compensation, and highly flexible compound prediction modes. AV1, unlike the older codecs, employs a dynamic motion vector referencing scheme that obtains candidate motion vectors from the spatial and temporal neighbors and ranks them for efficient entropy coding. In transform coding, AV1 extends its flexibility in terms of both the transform block sizes and kernels, where it extends the maximum transform block size to 64 × 64. In terms of kernels, AV1 allows each transform block to choose its own transform kernel independently. In the quantization step, where the transform coefficients are quantized, the quantization parameter (QP) ranges between 0 and 255. In the entropy coding system, AV1 employs the M-ary symbol arithmetic coding method, which was originally developed for the Daala codec. In postprocessing, AV1 allows three optional in-loop filter stages: a deblocking filter, a constrained directional enhancement filter (CDEF), and a loop restoration filter [3].

All the above-mentioned features help both codecs achieve better coding efficiency and quality, especially for high-resolution video content.

Flow charts for each coding standard discussed in the manuscript are depicted in Figure 11 [2,3,5,6].

Apart from the bitrate savings, we also measured and compared the processing time of all codecs at all resolutions. For this analysis, we chose the “NeptuneFountain3” test sequence, which is situated in the middle of the SI-TI plot. Table 13 describes the parameters of the PC on which the experiments were carried out. Table 14 shows the coding time in seconds, and Table 15 shows a comparison of coding times relative to the AV1 codec, which achieved the shortest time. As can be seen, H.264/AVC reaches a coding time very similar to that of AV1, and H.265/HEVC takes about twice as long as AV1. The longest coding time, as expected, is reached by the H.266/VVC codec, ranging from 27 times longer at a bitrate of 1 Mbps under UHD resolution to 174 times longer at a bitrate of 15 Mbps under FHD resolution when compared to the AV1 codec.

4. Conclusions

This paper deals with the compression performance of the latest and most used video codecs, namely H.266/VVC, AV1, H.265/HEVC, and H.264/AVC. In our experiments, we worked with 63 test sequences from three different databases with large diversity. Firstly, we had to choose some of the sequences according to SI-TI analysis. We selected seven sequences, one from each of the four corners and three from the middle of the SI-TI plot. Subsequently, we encoded them with the particular codecs, namely H.264/AVC, H.265/HEVC, H.266/VVC, and AV1, at Full HD (FHD), Ultra HD (UHD), and 8K resolution. The bitrates were set to 1, 3, 5, 7, 10, and 15 Mbps for FHD and UHD resolutions and 5, 7, 10, 15, 30, and 50 Mbps for 8K resolution. Altogether, 420 test sequences were encoded. For the encoding, we decided to use the average bitrate coding (ABR) mode and one-pass rate control mode, except when encoding sequences with the AVC and HEVC codecs at 8K resolution, where the two-pass encoding mode was used. For quality assessment, we used objective metrics, namely peak signal-to-noise ratio (PSNR), structural similarity index (SSIM), and video multi-method assessment fusion (VMAF), which are full-reference methods. In terms of the Bjøntegaard delta (BD) model, the results showed that H.266/VVC outperforms all other codecs, namely H.264/AVC, H.265/HEVC, and AV1. The averaged bitrate savings were approximately 78% for H.266/VVC, 63% for AV1, and 53% for H.265/HEVC relative to H.264/AVC, 59% for H.266/VVC and 22% for AV1 relative to H.265/HEVC, and 46% for H.266/VVC relative to AV1, all at 8K resolution. The results also varied depending on the test sequence and resolution used; with higher resolution, the effectiveness of newly developed codecs, such as H.266/VVC and AV1, was greater. This confirmed that the H.266/VVC and AV1 codecs have been primarily developed for videos at high resolutions, such as 8K and/or UHD. In addition to bitrate savings, we also compared the processing time of all codecs at all resolutions. To carry out the analysis, we selected the “NeptuneFountain3” test sequence, which is positioned in the middle of the SI-TI plot. Our findings revealed that H.264/AVC has a similar coding time to AV1, whereas H.265/HEVC takes about twice as long as AV1. As expected, the H.266/VVC codec had the longest coding time, ranging from 27 times longer at a bitrate of 1 Mbps under UHD resolution to 174 times longer at a bitrate of 15 Mbps under FHD resolution when compared to the AV1 codec.

In the near future, we would like to assess other test sequences from 8K datasets using objective metrics. Subsequently, we plan to select appropriate sequences and evaluate them using subjective methods, such as absolute category rating (ACR), absolute category rating with hidden reference (ACR-HR), or the double stimulus impairment scale (DSIS). From the results, we intend to calculate the correlation between the objective and subjective results by using Pearson and Spearman’s correlation coefficients. Moreover, the results will be used as inputs to neural networks to refine our proposed model, which can predict quality based on objective metrics.
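The planned correlation analysis can be sketched in a few lines of Python. The example below computes the Pearson coefficient directly and the Spearman coefficient as the Pearson coefficient of the ranks. It assumes that the data contain no tied values, and the scores are made-up placeholders, not results of this study; the function names are hypothetical.

```python
import numpy as np


def pearson(x, y):
    """Pearson linear correlation coefficient."""
    x = np.asarray(x, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)
    return float(np.corrcoef(x, y)[0, 1])


def ranks(values):
    """Ranks starting at 1; assumes that there are no tied values."""
    values = np.asarray(values, dtype=np.float64)
    order = np.argsort(values)
    result = np.empty(len(values), dtype=np.float64)
    result[order] = np.arange(1, len(values) + 1)
    return result


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Placeholder data: objective scores (e.g., PSNR in dB) vs. subjective scores (e.g., MOS)
objective = [30.0, 34.0, 37.0, 40.0]
subjective = [1.5, 2.9, 3.1, 4.6]
plcc = pearson(objective, subjective)
srocc = spearman(objective, subjective)
```

Because the placeholder scores increase monotonically together, the Spearman coefficient is 1, while the Pearson coefficient is slightly lower because the relationship is not perfectly linear.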

Author Contributions

Conceptualization, M.U. and L.S. (Lukas Sevcik); Methodology, M.U., L.S. (Lukas Sevcik), J.B. and L.S. (Lenka Smatanova); Software, M.U., L.S. (Lukas Sevcik), J.B. and L.S. (Lenka Smatanova); Resources, M.U.; Writing–original draft, M.U. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the Slovak VEGA grant agency, Project No. 1/0588/22 “Research of a location-aware system for the achievement of QoE in 5G and B5G networks”.

Data Availability Statement

Data are contained within the article.

Conflicts of Interest

The authors declare no conflict of interest.

Figure 1. SI-TI diagrams for the entire dataset for: (a) 8K resolution. (b) UHD resolution. (c) FHD resolution.

Figure 2. Previews of used sequences.

Figure 3. A flowchart of the encoding and evaluation process. The asterisk (*) marks files (sequences) with the YUV extension, i.e., files in uncompressed format.

Figure 4. SSIM RD plots at 8K resolution.

Figure 5. SSIM RD plots at UHD resolution.

Figure 6. SSIM RD plots at FHD resolution.

Figure 7. VMAF RD plots at 8K resolution.

Figure 8. VMAF RD plots at UHD resolution.

Figure 9. VMAF RD plots at FHD resolution.

Figure 10. Averaged BD-BR savings, depending on codec and resolution.

Figure 11. (a–d) Coding structure of particular encoders.

Table 1. Dataset descriptions.

Dataset	Resolution	Color Space	Bitdepth	Framerate [fps]	Number of Frames
Fraunhofer	7680 × 4320p	ITU-R BT.2020 (YCbCr 4:2:0)	10 bit	60	600
PP8K	7680 × 4320p	ITU-R BT.2020 (YCbCr 4:2:0)	10 bit	60	600
SEPE	8192 × 4320p	ITU-R BT.2020 (RGBA 4:2:0)	8 bit	29.97	300

Table 2. Test sequence characteristics.

Dataset	Test Sequence	Description	SI (8K)	TI (8K)	SI (UHD)	TI (UHD)	SI (FHD)	TI (FHD)
Fraunhofer	BodeMuseum	A view of the Bode Museum in Berlin. Only a train and a boat move in the background. The camera is fixed.	50.83	6.78	71.77	6.02	88.57	4.39
Fraunhofer	NeptuneFountain3	A view of the Neptune Fountain in Berlin. Water splashes from the fountain. The camera moves around the fountain from left to right.	26.55	21.23	39.04	21.16	55.23	20.75
Fraunhofer	TiergartenParkway	A view of a parkway close to the Tiergarten in Berlin. The camera moves forward as if a person is walking with a camera.	54.13	33.97	79.56	33.91	101.25	32.98
PP8K	Cooking	A chef is cooking, and the flame is moving in the pan. The camera is fixed.	11.60	38.42	13.98	38.36	23.65	37.93
PP8K	Giraffe	Three giraffes are walking in the zoo. The camera is fixed.	34.62	8.92	52.70	8.07	63.72	7.47
PP8K	Koi	The colorful koi are swimming in the fish tank. A lot of bubbles produced by the working oxygen generator are floating upward in the water. The camera moves slowly from left to right.	12.91	7.70	12.70	6.55	18.84	5.89
SEPE	36	Many people are skating on the ice rink. The camera is fixed.	36.89	14.53	59.42	14.32	78.11	13.45
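
For readers unfamiliar with the SI and TI values listed here, the following minimal sketch shows how they are commonly computed following the definitions in ITU-T P.910: SI is the maximum over time of the spatial standard deviation of the Sobel-filtered luma frame, and TI is the maximum over time of the spatial standard deviation of the difference between successive frames. This is not the tool used in the study. The function names, the use of the population standard deviation, and the restriction of the Sobel filter to interior pixels are assumptions of this sketch.

```python
import math

def _std(values):
    """Population standard deviation of a flat list of numbers."""
    mean = sum(values) / len(values)
    return math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))

def sobel_magnitude(frame):
    """Sobel gradient magnitude at the interior pixels of a 2-D luma frame."""
    height, width = len(frame), len(frame[0])
    out = []
    for r in range(1, height - 1):
        for c in range(1, width - 1):
            gx = (frame[r - 1][c + 1] + 2 * frame[r][c + 1] + frame[r + 1][c + 1]
                  - frame[r - 1][c - 1] - 2 * frame[r][c - 1] - frame[r + 1][c - 1])
            gy = (frame[r + 1][c - 1] + 2 * frame[r + 1][c] + frame[r + 1][c + 1]
                  - frame[r - 1][c - 1] - 2 * frame[r - 1][c] - frame[r - 1][c + 1])
            out.append(math.hypot(gx, gy))
    return out

def si_ti(frames):
    """Return (SI, TI) for a list of luma frames given as lists of rows."""
    si = max(_std(sobel_magnitude(frame)) for frame in frames)
    diffs = [
        _std([b - a for row_a, row_b in zip(prev, cur) for a, b in zip(row_a, row_b)])
        for prev, cur in zip(frames, frames[1:])
    ]
    ti = max(diffs) if diffs else 0.0
    return si, ti

# Tiny synthetic example: a flat frame followed by a frame with a bright corner.
flat = [[0] * 4 for _ in range(4)]
corner = [[0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 10, 10], [0, 0, 10, 10]]
print(si_ti([flat, corner]))
```

A high SI indicates spatially detailed content and a high TI indicates strong motion, which is why sequences from the corners and the middle of the SI-TI plot cover a wide range of content.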

Table 3. Encoding parameters.

Codec	Resolution	Bitdepth	Bitrate [Mbps]
H.264/AVC, H.265/HEVC, H.266/VVC, AV1	FHD, UHD	8 bit	1, 3, 5, 7, 10, 15
H.264/AVC, H.265/HEVC, H.266/VVC, AV1	8K	10 bit	5, 7, 10, 15, 30, 50

Table 4. Command line settings used for encoding.

Codec	Resolution	Parameter	Command-Line Setting
H.264/AVC	FHD, UHD	1-pass	ffmpeg -i {input_sequence} -c:v h264 -b:v {bitrate} -g 60 {output_sequence}
H.264/AVC	8K	2-pass	ffmpeg -y -i {input_sequence} -c:v h264 -b:v {bitrate} -g 60 -pass 1 -f null /dev/null && ffmpeg -i {input_sequence} -c:v h264 -b:v {bitrate} -g 60 -pass 2 {output_sequence}
H.265/HEVC	FHD, UHD	1-pass	ffmpeg -i {input_sequence} -c:v hevc -b:v {bitrate} -g 60 {output_sequence}
H.265/HEVC	8K	2-pass	ffmpeg -y -i {input_sequence} -c:v hevc -b:v {bitrate} -g 60 -x265-params pass=1 -f null /dev/null && ffmpeg -i {input_sequence} -c:v hevc -b:v {bitrate} -g 60 -x265-params pass=2 {output_sequence}
H.266/VVC	FHD, UHD	1-pass	ffmpeg -i {input_sequence} -c:v vvc -b:v {bitrate} -g 60 -bit depth 8 {output_sequence}
H.266/VVC	8K	1-pass	ffmpeg -i {input_sequence} -c:v vvc -b:v {bitrate} -g 60 {output_sequence}
AV1	FHD, UHD, 8K	1-pass	ffmpeg -i {input_sequence} -c:v libsvtav1 -b:v {bitrate} -g 60 {output_sequence}
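
The command templates in Table 4 can be expanded over the full grid of Table 3 with a short script. The sketch below is illustrative only and makes several assumptions: the file naming scheme, the `.mp4` container, and the `5M`-style bitrate strings are hypothetical, and the additional bit-depth option that Table 4 lists for H.266/VVC at FHD and UHD is left out. Raw YUV input would also need input options (frame size, pixel format, frame rate) that Table 4 does not show.

```python
from itertools import product

SEQUENCES = ["BodeMuseum", "NeptuneFountain3", "TiergartenParkway",
             "Cooking", "Giraffe", "Koi", "36"]
# Encoder names as written in Table 4.
ENCODERS = {"H.264/AVC": "h264", "H.265/HEVC": "hevc",
            "H.266/VVC": "vvc", "AV1": "libsvtav1"}
# Bitrate ladders from Table 3, in Mbps.
BITRATES_MBPS = {"FHD": [1, 3, 5, 7, 10, 15],
                 "UHD": [1, 3, 5, 7, 10, 15],
                 "8K": [5, 7, 10, 15, 30, 50]}

def build_commands(codec, resolution, bitrate_mbps, src, dst):
    """Return one ffmpeg argument list per pass for a single encode."""
    base = ["ffmpeg", "-i", src, "-c:v", ENCODERS[codec],
            "-b:v", f"{bitrate_mbps}M", "-g", "60"]
    # Two-pass encoding is used only for AVC and HEVC at 8K (Table 4).
    two_pass = resolution == "8K" and codec in ("H.264/AVC", "H.265/HEVC")
    if not two_pass:
        return [base + [dst]]
    if codec == "H.264/AVC":
        pass1, pass2 = ["-pass", "1"], ["-pass", "2"]
    else:
        pass1, pass2 = ["-x265-params", "pass=1"], ["-x265-params", "pass=2"]
    first = ["ffmpeg", "-y"] + base[1:] + pass1 + ["-f", "null", "/dev/null"]
    return [first, base + pass2 + [dst]]

jobs = []
for seq, codec, res in product(SEQUENCES, ENCODERS, BITRATES_MBPS):
    for rate in BITRATES_MBPS[res]:
        src = f"{seq}_{res}.yuv"                            # hypothetical name
        dst = f"{seq}_{res}_{ENCODERS[codec]}_{rate}M.mp4"  # hypothetical name
        jobs.append(build_commands(codec, res, rate, src, dst))

# 7 sequences x 4 codecs x 3 resolutions x 6 bitrates = 504 encodes
print(len(jobs))
```

Each entry of `jobs` is a list of argument lists that could be passed to `subprocess.run` one after the other. Keeping the arguments as lists rather than shell strings avoids quoting problems with file names.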

Table 5. BD-BR savings for particular test sequences at 8K resolution.

8K
	BodeMuseum	NeptuneFountain3	TiergartenParkway	Cooking	Giraffe	Koi	36
H.266 vs. H.264	−74.69%	−70.78%	−74.67%	−88.45%	−73.83%	−93.16%	−72.86%
H.266 vs. H.265	−56.33%	−40.82%	−58.44%	−76.36%	−56.38%	−77.58%	−51.24%
H.266 vs. AV1	−45.67%	−37.11%	−41.85%	−64.99%	−47.01%	−69.63%	−16.59%
H.265 vs. H.264	−44.17%	−52.60%	−45.77%	−66.02%	−44.25%	−76.96%	−42.91%
AV1 vs. H.264	−53.05%	−54.18%	−60.75%	−75.71%	−50.34%	−81.25%	−66.80%
AV1 vs. H.265	−17.29%	−7.17%	−28.74%	−31.43%	−14.32%	−15.07%	−41.48%
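
To make the BD-BR figures easier to interpret, the following sketch shows a common formulation of Bjøntegaard's method: fit a cubic polynomial to log10(bitrate) as a function of quality for each codec, integrate both fits over the overlapping quality range, and convert the mean log-rate difference into a percentage. The source does not state which implementation was used, so this is a sketch under that assumption. The function names and the rate-quality points are hypothetical, and the code needs at least four distinct quality values per curve and a non-empty overlap.

```python
import math

def fit_cubic(x, y):
    """Least-squares cubic fit; returns [c0, c1, c2, c3] for c0 + c1*x + c2*x^2 + c3*x^3."""
    n = 4
    # Normal equations a * coeffs = b, solved by Gaussian elimination with pivoting.
    a = [[sum(xi ** (i + j) for xi in x) for j in range(n)] for i in range(n)]
    b = [sum(yi * xi ** i for xi, yi in zip(x, y)) for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, n):
            factor = a[row][col] / a[col][col]
            for k in range(col, n):
                a[row][k] -= factor * a[col][k]
            b[row] -= factor * b[col]
    coeffs = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(a[i][j] * coeffs[j] for j in range(i + 1, n))
        coeffs[i] = (b[i] - tail) / a[i][i]
    return coeffs

def integrate(coeffs, lo, hi):
    """Definite integral of the polynomial with the given coefficients."""
    def antiderivative(t):
        return sum(c * t ** (k + 1) / (k + 1) for k, c in enumerate(coeffs))
    return antiderivative(hi) - antiderivative(lo)

def bd_rate(rate_anchor, qual_anchor, rate_test, qual_test):
    """Average bitrate difference in percent of the test codec versus the anchor."""
    lo = max(min(qual_anchor), min(qual_test))
    hi = min(max(qual_anchor), max(qual_test))
    # Map the overlapping quality range to [-1, 1] to keep the fit well conditioned.
    mid, half = (hi + lo) / 2, (hi - lo) / 2
    def normalize(quality):
        return [(q - mid) / half for q in quality]
    fit_a = fit_cubic(normalize(qual_anchor), [math.log10(r) for r in rate_anchor])
    fit_t = fit_cubic(normalize(qual_test), [math.log10(r) for r in rate_test])
    # The overlap has length 2 in normalized units, so the mean is the integral / 2.
    avg_diff = (integrate(fit_t, -1, 1) - integrate(fit_a, -1, 1)) / 2
    return (10 ** avg_diff - 1) * 100

# Hypothetical rate-quality points (Mbps, dB); the test codec needs half the bitrate.
rates = [1, 3, 5, 7, 10, 15]
psnr = [30.0, 33.5, 35.2, 36.4, 37.6, 39.0]
print(f"{bd_rate(rates, psnr, [r / 2 for r in rates], psnr):.2f}%")
```

A negative result means that the test codec needs less bitrate than the anchor for the same quality, which is the sign convention used in the BD-BR tables.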

Table 6. BD-PSNR gains for particular test sequences at 8K resolution.

8K
	BodeMuseum	NeptuneFountain3	TiergartenParkway	Cooking	Giraffe	Koi	36
H.266 vs. H.264	3.39 dB	4.48 dB	5.57 dB	5.01 dB	2.67 dB	2.69 dB	3.03 dB
H.266 vs. H.265	1.78 dB	1.59 dB	2.69 dB	1.24 dB	1.38 dB	0.76 dB	1.54 dB
H.266 vs. AV1	1.31 dB	1.30 dB	1.66 dB	0.74 dB	1.06 dB	0.61 dB	0.36 dB
H.265 vs. H.264	1.61 dB	2.89 dB	2.88 dB	3.77 dB	1.29 dB	1.93 dB	1.49 dB
AV1 vs. H.264	2.08 dB	3.18 dB	3.91 dB	4.27 dB	1.61 dB	2.08 dB	2.67 dB
AV1 vs. H.265	0.46 dB	0.29 dB	1.03 dB	0.50 dB	0.32 dB	0.15 dB	1.17 dB

Table 7. BD-BR savings for particular test sequences at UHD resolution.

UHD
	BodeMuseum	NeptuneFountain3	TiergartenParkway	Cooking	Giraffe	Koi	36
H.266 vs. H.264	−70.53%	−77.96%	−77.21%	−78.85%	−68.53%	−86.77%	−68.17%
H.266 vs. H.265	−54.50%	−50.66%	−63.13%	−50.82%	−57.31%	−66.59%	−47.54%
H.266 vs. AV1	−39.09%	−44.94%	−39.82%	−39.39%	−39.61%	−63.41%	−24.02%
H.265 vs. H.264	−26.75%	−55.62%	−31.39%	−61.41%	−28.22%	−65.76%	−41.56%
AV1 vs. H.264	−53.74%	−56.06%	−66.69%	−70.03%	−50.40%	−71.24%	−60.00%
AV1 vs. H.265	−28.78%	−9.03%	−44.57%	−22.71%	−31.78%	−17.16%	−31.72%

Table 8. BD-PSNR gains for particular test sequences at UHD resolution.

UHD
	BodeMuseum	NeptuneFountain3	TiergartenParkway	Cooking	Giraffe	Koi	36
H.266 vs. H.264	4.49 dB	5.14 dB	5.40 dB	7.99 dB	3.82 dB	4.92 dB	4.24 dB
H.266 vs. H.265	3.15 dB	2.06 dB	3.88 dB	2.45 dB	2.73 dB	1.65 dB	2.24 dB
H.266 vs. AV1	1.62 dB	1.37 dB	1.61 dB	1.51 dB	1.53 dB	1.32 dB	0.83 dB
H.265 vs. H.264	1.34 dB	3.09 dB	1.53 dB	5.54 dB	1.10 dB	3.27 dB	2.00 dB
AV1 vs. H.264	2.87 dB	3.77 dB	3.79 dB	6.49 dB	2.29 dB	3.59 dB	3.41 dB
AV1 vs. H.265	1.53 dB	0.68 dB	2.27 dB	0.95 dB	1.19 dB	0.32 dB	1.41 dB

Table 9. BD-BR savings for particular test sequences at FHD resolution.

FHD
	BodeMuseum	NeptuneFountain3	TiergartenParkway	Cooking	Giraffe	Koi	36
H.266 vs. H.264	−61.00%	−60.93%	−62.18%	−67.69%	−59.27%	−70.84%	−60.70%
H.266 vs. H.265	−51.81%	−44.63%	−56.96%	−38.93%	−54.16%	−55.05%	−52.52%
H.266 vs. AV1	−29.26%	−42.53%	−41.35%	−39.68%	−36.14%	−52.45%	−28.48%
H.265 vs. H.264	−18.75%	−30.26%	−14.31%	−49.81%	−14.92%	−39.09%	−19.01%
AV1 vs. H.264	−44.75%	−33.84%	−38.98%	−52.41%	−38.07%	−45.38%	−48.06%
AV1 vs. H.265	−31.35%	−5.11%	−29.25%	−3.00%	−28.16%	−9.99%	−33.61%

Table 10. BD-PSNR gains for particular test sequences at FHD resolution.

FHD
	BodeMuseum	NeptuneFountain3	TiergartenParkway	Cooking	Giraffe	Koi	36
H.266 vs. H.264	3.50 dB	2.73 dB	3.72 dB	5.07 dB	3.60 dB	3.20 dB	3.21 dB
H.266 vs. H.265	2.69 dB	1.69 dB	3.13 dB	1.88 dB	2.96 dB	1.85 dB	2.43 dB
H.266 vs. AV1	1.19 dB	1.50 dB	1.84 dB	1.84 dB	1.78 dB	1.62 dB	1.22 dB
H.265 vs. H.264	0.81 dB	1.05 dB	0.58 dB	3.19 dB	0.64 dB	1.35 dB	0.78 dB
AV1 vs. H.264	2.31 dB	1.23 dB	1.88 dB	3.23 dB	1.82 dB	1.59 dB	1.99 dB
AV1 vs. H.265	1.50 dB	0.18 dB	1.30 dB	0.04 dB	1.18 dB	0.23 dB	1.21 dB

Table 11. Averaged BD-BR savings depending on codec and resolution.

	FHD	UHD	8K
H.266 vs. H.264	−63.23%	−75.43%	−78.35%
H.266 vs. H.265	−50.58%	−55.79%	−59.59%
H.266 vs. AV1	−38.56%	−41.47%	−46.12%
H.265 vs. H.264	−26.59%	−44.39%	−53.24%
AV1 vs. H.264	−43.07%	−61.17%	−63.16%
AV1 vs. H.265	−20.07%	−26.53%	−22.21%
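
The averaged values are arithmetic means over the seven test sequences. As a quick check, the sketch below recomputes the 8K column from the per-sequence values of Table 5. The variable and function names are illustrative only.

```python
# Per-sequence BD-BR values at 8K resolution, copied from Table 5 (percent).
BD_BR_8K = {
    "H.266 vs. H.264": [-74.69, -70.78, -74.67, -88.45, -73.83, -93.16, -72.86],
    "H.266 vs. H.265": [-56.33, -40.82, -58.44, -76.36, -56.38, -77.58, -51.24],
    "H.266 vs. AV1": [-45.67, -37.11, -41.85, -64.99, -47.01, -69.63, -16.59],
    "H.265 vs. H.264": [-44.17, -52.60, -45.77, -66.02, -44.25, -76.96, -42.91],
    "AV1 vs. H.264": [-53.05, -54.18, -60.75, -75.71, -50.34, -81.25, -66.80],
    "AV1 vs. H.265": [-17.29, -7.17, -28.74, -31.43, -14.32, -15.07, -41.48],
}

def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)

averages = {pair: mean(values) for pair, values in BD_BR_8K.items()}
for pair, value in averages.items():
    print(f"{pair}: {value:.2f}%")
```

The recomputed means agree with the 8K column to within 0.01 percentage points; the small remainder is presumably due to the published per-sequence values already being rounded.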

Table 12. Averaged BD-PSNR gains depending on codec and resolution.

	FHD	UHD	8K
H.266 vs. H.264	3.58 dB	5.14 dB	3.83 dB
H.266 vs. H.265	2.37 dB	2.59 dB	1.57 dB
H.266 vs. AV1	1.57 dB	1.40 dB	1.01 dB
H.265 vs. H.264	1.20 dB	2.55 dB	2.27 dB
AV1 vs. H.264	2.01 dB	3.75 dB	2.83 dB
AV1 vs. H.265	0.81 dB	1.19 dB	0.56 dB

Table 13. Testing PC description.

Processor	AMD Ryzen 9 5950X 16-Core 4000.0 MHz
SSD	Samsung SSD 980 Pro 1TB
RAM	64 GB DDR4 SDRAM
Graphic card	NVIDIA GeForce RTX 3080 Founders Edition LHR
Operating system	Microsoft Windows 10 Education 64-bit

Table 14. The coding time for the “NeptuneFountain3” test sequence at all resolutions.

FHD
Codec/Bitrate	1 Mbps	3 Mbps	5 Mbps	7 Mbps	10 Mbps	15 Mbps
H.264/AVC	3.933 s	4.284 s	4.403 s	4.388 s	4.446 s	4.813 s
H.265/HEVC	6.284 s	7.462 s	8.457 s	9.263 s	10.042 s	11.245 s
H.266/VVC	185.142 s	328.863 s	432.875 s	514.937 s	625.180 s	765.770 s
AV1	3.079 s	3.572 s	3.676 s	3.802 s	3.952 s	4.387 s
UHD
Codec/Bitrate	1 Mbps	3 Mbps	5 Mbps	7 Mbps	10 Mbps	15 Mbps
H.264/AVC	15.719 s	16.177 s	16.424 s	16.578 s	16.864 s	17.218 s
H.265/HEVC	23.463 s	27.036 s	34.500 s	35.183 s	36.568 s	32.863 s
H.266/VVC	312.819 s	519.568 s	674.671 s	807.677 s	973.196 s	1229.545 s
AV1	11.467 s	11.826 s	12.355 s	12.664 s	12.889 s	13.282 s
8K
Codec/Bitrate	5 Mbps	7 Mbps	10 Mbps	15 Mbps	30 Mbps	50 Mbps
H.264/AVC	48.772 s	49.057 s	48.811 s	49.487 s	49.916 s	49.632 s
H.265/HEVC	96.779 s	102.943 s	157.587 s	168.960 s	183.300 s	243.093 s
H.266/VVC	1693.086 s	1921.788 s	2271.042 s	2815.820 s	4286.146 s	6162.371 s
AV1	39.473 s	39.268 s	40.079 s	41.569 s	43.875 s	52.724 s

Table 15. A comparison of coding times for the “NeptuneFountain3” test sequence at all resolutions relative to the AV1 codec.

FHD
Codec/Bitrate	1 Mbps	3 Mbps	5 Mbps	7 Mbps	10 Mbps	15 Mbps
H.264/AVC	1.28×	1.20×	1.20×	1.15×	1.13×	1.10×
H.265/HEVC	2.04×	2.09×	2.30×	2.44×	2.54×	2.56×
H.266/VVC	60.13×	92.07×	117.76×	135.44×	158.19×	174.55×
AV1	1.00×	1.00×	1.00×	1.00×	1.00×	1.00×
UHD
Codec/Bitrate	1 Mbps	3 Mbps	5 Mbps	7 Mbps	10 Mbps	15 Mbps
H.264/AVC	1.37×	1.37×	1.33×	1.31×	1.31×	1.30×
H.265/HEVC	2.05×	2.29×	2.79×	2.78×	2.84×	2.47×
H.266/VVC	27.28×	43.93×	54.61×	63.78×	75.51×	92.57×
AV1	1.00×	1.00×	1.00×	1.00×	1.00×	1.00×
8K
Codec/Bitrate	5 Mbps	7 Mbps	10 Mbps	15 Mbps	30 Mbps	50 Mbps
H.264/AVC	1.24×	1.25×	1.22×	1.19×	1.14×	0.94×
H.265/HEVC	2.45×	2.62×	3.93×	4.06×	4.18×	4.61×
H.266/VVC	42.89×	48.94×	56.66×	67.74×	97.69×	116.88×
AV1	1.00×	1.00×	1.00×	1.00×	1.00×	1.00×
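
The relative values are the coding times of Table 14 divided by the AV1 coding time at the same resolution and bitrate. The sketch below reproduces the FHD block; the variable and function names are illustrative only.

```python
# Coding times in seconds for "NeptuneFountain3" at FHD, copied from Table 14.
# Columns correspond to 1, 3, 5, 7, 10, and 15 Mbps.
CODING_TIME_FHD_S = {
    "H.264/AVC": [3.933, 4.284, 4.403, 4.388, 4.446, 4.813],
    "H.265/HEVC": [6.284, 7.462, 8.457, 9.263, 10.042, 11.245],
    "H.266/VVC": [185.142, 328.863, 432.875, 514.937, 625.180, 765.770],
    "AV1": [3.079, 3.572, 3.676, 3.802, 3.952, 4.387],
}

def relative_to(times, reference="AV1"):
    """Divide every codec's coding time by the reference codec's time, per bitrate."""
    ref = times[reference]
    return {
        codec: [round(t / r, 2) for t, r in zip(values, ref)]
        for codec, values in times.items()
    }

relative_fhd = relative_to(CODING_TIME_FHD_S)
for codec, ratios in relative_fhd.items():
    print(codec, " ".join(f"{ratio:.2f}x" for ratio in ratios))
```

For H.266/VVC, the ratio grows from about 60 at 1 Mbps to about 175 at 15 Mbps, which shows that its coding time increases with bitrate much faster than that of AV1.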

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
