Article

Color Transformations Resulting in Loss of Performance in Modern Video Compression Software Systems

Institute of Multimedia Telecommunications, Poznań University of Technology, 61-131 Poznań, Poland
*
Author to whom correspondence should be addressed.
Information 2026, 17(4), 366; https://doi.org/10.3390/info17040366
Submission received: 3 February 2026 / Revised: 29 March 2026 / Accepted: 31 March 2026 / Published: 13 April 2026
(This article belongs to the Special Issue Signal Processing and Machine Learning, 2nd Edition)

Abstract

Modern video compression is implemented in complex software systems that reuse software modules from various sources. This is particularly evident in experimental software systems designed for researching and standardizing new compression technologies. These systems often incorporate software modules operating in different color spaces. For example, AI-based techniques are often used in video coding experiments. The corresponding software modules often operate on RGB representations, while other modules operate on YCBCR components. In this study, we demonstrate that the quality loss resulting from color transformations is comparable to the respective quantization noise. Consecutive cycles of color transformations do not result in significant additional degradation. However, for image compression, very different results are obtained in different color representations. This aspect must be carefully considered in compression research. This paper supports these considerations with extensive experimental results in the context of ITU Recommendations BT.709 and BT.2020, as well as AVC and HEVC compression.

1. Introduction

Visual data is expressed in various color spaces, which are chosen according to the field of application and the respective traditions of the community. The multitude of color coordinate systems constitutes a field of study in itself [1,2,3,4,5,6]. Individual color coordinate systems have been designed and developed for various applications. For instance, RGB representations are commonly used in cameras and displays, whereas YCBCR color representations, i.e., luma and chroma, are commonly used for transmission and compression. The problem becomes even more complex when considering the numerous variants of RGB and YCBCR color spaces corresponding to various technological stages of development [1,2,3,4,5,6,7].
Recently, the clear distinction between the application areas of RGB and YCBCR has become blurred (e.g., [8]). However, this issue is often not carefully considered. It does not cause problems when the entire processing path is based on a single color space. When video processing uses different color representations, potential quality losses must be considered. However, color spaces are often treated as a secondary topic even among practicing experts, which means their impact may not always receive sufficient attention.
This observation also applies to research on image and video compression. The problem has become even more pronounced due to the use of various software programs that operate on video represented in different color spaces. Often, the machine vision and artificial intelligence software modules operate in the RGB color space, whereas the classic video coding modules operate in the YCBCR transmission color space [9,10,11]. This practical approach stems from the availability of various software modules and is often used by video expert groups when developing test model software (e.g., [12,13,14,15]). This modus operandi requires:
  • Attention to the variants of RGB and YCBCR color spaces, versions of opto-electric and electro-optical transfer functions, and the use of proper transformations between the selected versions of RGB and YCBCR color spaces (e.g., [7,16,17,18]);
  • Attention to sample encoding range (e.g., [7]);
  • Attention to video quality losses due to color transformations.
This paper focuses primarily on the third problem. Its relevance has recently increased due to the development of hybrid video compression methods that combine classic compression techniques with artificial intelligence (AI)-based methods [19,20,21,22,23]. The research, testing, and implementation of these methods are inevitably related to software, in which modules operating on RGB representations are linked to modules operating on YCBCR representations.
The aforementioned problem is important for evaluating the results of video compression. This task is especially important in the research of improved compression methods coordinated by international expert groups such as the Joint Video Experts Team (JVET), the Moving Picture Experts Group (MPEG), the Video Coding Experts Group (VCEG), and the Moving Picture, Audio and Data Coding by Artificial Intelligence (MPAI). These groups primarily use ablation studies to test newly proposed compression tools. They use software called a “test model” that corresponds to the current state of research. First, a new tool is tested individually by embedding it into the test model. The compression performance of the extended test model is assessed by measuring the bitrate change versus the quality change in the decoded video. For most of the tested tools, the quality change perceived by humans is so small that a reliable MOS (Mean Opinion Score) estimation is difficult [24]. Considering other issues with subjective quality assessment, such as complexity, slowness, and cost, a practical approach is to use simple quality metrics, even though they poorly approximate subjective quality assessment by humans. These metrics include PSNR [24], SSIM [25], VQM [26], and VMAF [27]. Due to its simplicity, PSNR (Peak Signal-to-Noise Ratio) is the most frequently used and reported metric. Therefore, it is also used in this paper.
Another important issue is that solutions that differ by small margins in MOS or PSNR are often compared. Consequently, relatively minor quality losses due to color transformations can significantly affect the results of comparisons and the conclusions drawn from experiments. This paper discusses errors resulting from color transformations to address this issue.
In video compression research, test images and video clips are used for experimentation. Expert groups carefully select these test pictures from high-quality images and video clips of diverse content types. Even these high-quality pictures contain noise generated during the photoelectric transformation or quantization processes [7,28]. It has been found that a certain level of noise may be important for viewers to perceive good picture quality (e.g., [29,30]).
High-quality test images and video clips with a low level of noise are used as references in compression experiments, particularly in image and video quality assessments.

2. State-of-the-Art

Color transformations from RGB to transmission color coordinates (i.e., luma and chroma) partially decorrelate the RGB components (e.g., [7]). These color transformations improve compression efficiency and are standard engineering practice for color image and video transmission and storage. Rather than coding images in RGB, compression is implemented in the transmission YCBCR representation, i.e., using luma Y, chroma CB, and chroma CR [7,16,17,18]. The transformations in the decoder (YCBCR → RGB) must be the inverse of those in the encoder (RGB → YCBCR). Therefore, these transformations are carefully standardized for communication applications. There exist several standards that define RGB → YCBCR and YCBCR → RGB transformations. For digital video, it is common to use international standards defined for digital television, i.e., standards for standard-definition (SD) television [16], high-definition (HD) television [17], and ultra-high-definition (UHD) television [18]. The latter is also designed for High Dynamic Range and Wide Color Gamut video [6].
The aforementioned color transformations are linear in the sense that the output sample values are calculated as weighted sums of the input sample values [7]. Nevertheless, both the input and output samples have relatively short binary representations: 8 bits for standard dynamic range and 10 to 12 bits for high dynamic range. Good engineering practice dictates using a longer sample representation for internal processing and rounding the sample values before transmission or storage. This also applies to color transformations. As a result, color transformations introduce additional errors caused by rounding the sums of products. The properties of rounding noise are discussed elsewhere and will not be covered in detail here. It is sufficient to note that rounding noise exhibits a uniform probability distribution [28]. Additional distortion caused by color transformations includes clipping sample values at the edges of the sample value interval, but this phenomenon usually has less influence on video quality.
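As a rough reference point for the PSNR figures quoted later, the level of pure rounding noise can be estimated from the uniform-distribution property noted above. The sketch below assumes an error uniformly distributed over a unit-wide interval (variance 1/12) and a peak value of 2^bits − 1; the function name is illustrative and is not taken from the paper.

```python
import math

def rounding_noise_psnr(bits: int) -> float:
    """PSNR (dB) of rounding noise assumed uniform on [-0.5, 0.5)."""
    peak = 2 ** bits - 1      # largest sample value, e.g. 255 for 8 bits
    mse = 1.0 / 12.0          # variance of a uniform error of unit width
    return 10.0 * math.log10(peak ** 2 / mse)

psnr_8 = rounding_noise_psnr(8)
psnr_10 = rounding_noise_psnr(10)
print(f"8-bit:  {psnr_8:.2f} dB")
print(f"10-bit: {psnr_10:.2f} dB")
print(f"gain:   {psnr_10 - psnr_8:.2f} dB")
```

Under these assumptions a single rounding step corresponds to roughly 58.9 dB for 8-bit samples and roughly 71.0 dB for 10-bit samples, i.e., about 6 dB per additional bit.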
Therefore, it should be noted that errors are introduced into the sample values during a full cycle of color transformation (RGB → YCBCR → RGB). Similar effects are observed for various RGB and YCBCR color space variants [7].
The impact of color transformations on image quality has been extensively researched in the context of lossless and nearly lossless image coding.
Special reversible color transformations have been proposed for lossless image compression [31,32,33,34]. However, such transformations are not currently used for practical video processing and coding. Furthermore, the conditions for a reversible color transformation have been outlined in [34,35].
Some special transformations have also been proposed for nearly lossless image compression [35]. Nevertheless, the standard color transformations RGB → YCBCR and YCBCR → RGB, such as those in [16,17], have also been studied for nearly lossless image compression [36,37,38,39,40].
In [36,37,40], the problem of errors in multiple cycles of color transformation and compression was considered theoretically and experimentally. It was shown that rounding errors do not accumulate significantly in consecutive cycles of color transformations defined by ITU Rec. BT.601 [16]. Furthermore, even consecutive cycles consisting of RGB to YCBCR transformation, lossy JPEG or JPEG 2000 compression and decompression, and YCBCR to RGB inverse transformation do not exhibit significant error accumulation, i.e., the change in RGB sample values in the second and third cycles is negligible [40]. Nowadays, the results for ITU Rec. BT.601 color transformation are less important.
For a long time, the problem of rounding errors due to color transformations was not an important topic for lossy image and video compression because YCBCR representations are mostly produced in cameras, and RGB representations are retrieved in displays after decompression. However, the situation changed quite recently, perhaps after 2020. In particular, video compression research and development uses both classic compression software that processes YCBCR representations and artificial intelligence software that predominantly works on RGB representations. These software modules are usually very complex; therefore, in research, they are used as they are. However, in future commercial applications, they will likely be optimized. Thus, the question arises as to the trustworthiness of research results obtained using test software that includes both RGB and YCBCR processing modules.

3. Color Transformations and Corresponding Rounding Errors

The chroma components CB and CR are typically downsampled by a factor of two in both horizontal and vertical directions. When the luma sampling density remains unchanged, this chroma subsampling scheme is referred to as 4:2:0 format. The rationale behind this chroma subsampling is that the chroma components occupy a significantly narrower band of spatial frequencies than the luma component [7]. Consequently, conversion to the 4:2:0 format is a minor source of error for Standard Definition (SD), High Definition (HD), and Ultra High Definition (UHD) television and movie content.
Moreover, in recent video compression research using mixed RGB and YCBCR representations, chroma subsampling is often not employed.
For this reason, considerations on chroma subsampling may be omitted from this paper for the sake of concision.
Additionally, it should be noted that the R, G, and B components are calculated after applying a nonlinear opto-electric transfer function, also known as an opto-electronic transfer function [3,4,5,6,7]. The standard notation uses primed variables (R’, G’, B’) to emphasize this, where the primes denote that the numerical values are obtained using a nonlinear opto-electric transfer function. Opto-electric transfer functions are standardized, and the reciprocal electro-optical transfer functions are used for displays. Good examples of opto-electric and electro-optical transfer functions are those defined in ITU Recommendations BT.601 [16], BT.709 [17] and BT.2020 [18]. In fact, these three functions are very similar to each other. The three recommendations differ primarily in their definitions of the primaries and the allowed signal intervals. Proper transfer functions must be used in a display to display colors properly.
Opto-electric transfer functions are mostly implemented in video cameras, whereas the reciprocal electro-optical transfer functions are implemented in displays. Therefore, considerations of the opto-electric and electro-optical transfer functions may be omitted from this paper.
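Let us consider the color transformation
$$\mathbf{Y} = \mathbf{T}\,\mathbf{X},$$
where
$$\mathbf{X} = \begin{bmatrix} R' \\ G' \\ B' \end{bmatrix}, \qquad \mathbf{Y} = \begin{bmatrix} Y \\ C_B \\ C_R \end{bmatrix}, \qquad \mathbf{T} = \begin{bmatrix} t_{11} & t_{12} & t_{13} \\ t_{21} & t_{22} & t_{23} \\ t_{31} & t_{32} & t_{33} \end{bmatrix},$$
and R’, G’, B’ are the components obtained after the opto-electric transfer function or before the electro-optical transfer function is applied [3,4,7].
The inverse transformation is
$$\hat{\mathbf{X}} = \mathbf{S}\,\mathbf{Y} = \mathbf{T}^{-1}\,\mathbf{Y},$$
where $\hat{\mathbf{X}}$ is an approximation of $\mathbf{X}$ and
$$\mathbf{S} = \mathbf{T}^{-1} = \begin{bmatrix} s_{11} & s_{12} & s_{13} \\ s_{21} & s_{22} & s_{23} \\ s_{31} & s_{32} & s_{33} \end{bmatrix}.$$
Assume that the sample values of R’G’B’ and YCBCR are all integers. Since the matrix elements $t_{ij}$ and $s_{ij}$ are real, rounding is necessary after both color transformations.
For now, we assume that the R’G’B’ and YCBCR sample value intervals are infinite. This theoretical assumption implies that clipping is unnecessary after rounding.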
In [35,36,40], the sufficient condition for a reversible color transformation is provided:
$$|s_{11}| + |s_{12}| + |s_{13}| < 1, \qquad |s_{21}| + |s_{22}| + |s_{23}| < 1, \qquad |s_{31}| + |s_{32}| + |s_{33}| < 1.$$
After a single cycle of transformations R’G’B’ → YCBCR → R’G’B’, the sample errors are bounded by
$$L_R = \mathrm{round}\!\left[\frac{|s_{11}| + |s_{12}| + |s_{13}|}{2}\right], \qquad L_G = \mathrm{round}\!\left[\frac{|s_{21}| + |s_{22}| + |s_{23}|}{2}\right], \qquad L_B = \mathrm{round}\!\left[\frac{|s_{31}| + |s_{32}| + |s_{33}|}{2}\right],$$
where $L_R$, $L_G$, $L_B$ are the error bounds for the values of the R’, G’, B’ samples, respectively [37], i.e.,
$$|R' - \hat{R}'| \le L_R, \qquad |G' - \hat{G}'| \le L_G, \qquad |B' - \hat{B}'| \le L_B.$$
Furthermore, proofs are provided in [35,36,37,40] that the condition
$$|t_{11}| + |t_{12}| + |t_{13}| \le 1, \qquad |t_{21}| + |t_{22}| + |t_{23}| \le 1, \qquad |t_{31}| + |t_{32}| + |t_{33}| \le 1$$
implies that, after the first cycle of color transformations R’G’B’ → YCBCR → R’G’B’, rounding errors do not influence the output R’G’B’ sample values in the consecutive color transformation cycles.
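Both conditions and the error bounds depend only on absolute row sums, so they are easy to check numerically. The following is a minimal sketch: the helper names are hypothetical, the tolerance `EPS` is an assumption added to absorb floating-point summation error, and the example matrices are the full-range BT.709 coefficients quoted in Section 4.

```python
import math

EPS = 1e-9  # assumed tolerance for floating-point row sums

# Full-range BT.709 matrices as quoted in Section 4.
T_255 = [[0.2126, 0.7152, 0.0722],
         [-0.1146, -0.3854, 0.5000],
         [0.5000, -0.4542, -0.0458]]
S_255 = [[1.0, 0.0, 1.5748],
         [1.0, -0.187324, -0.468124],
         [1.0, 1.8556, 0.0]]

def abs_row_sums(matrix):
    """Sum of absolute values of each row."""
    return [sum(abs(v) for v in row) for row in matrix]

def is_reversible(s):
    """Sufficient condition on S: every absolute row sum is below 1."""
    return all(r < 1.0 for r in abs_row_sums(s))

def error_bounds(s):
    """L_R, L_G, L_B = round(row_sum / 2), with round(x) = floor(x + 0.5)."""
    return [math.floor(r / 2.0 + 0.5) for r in abs_row_sums(s)]

def later_cycles_stable(t):
    """Condition on T: every absolute row sum is at most 1."""
    return all(r <= 1.0 + EPS for r in abs_row_sums(t))

print("reversible (condition on S):", is_reversible(S_255))
print("error bounds L_R, L_G, L_B:", error_bounds(S_255))
print("later cycles stable (condition on T):", later_cycles_stable(T_255))
```

For these matrices the condition on S is not met, the bounds evaluate to one quantization step per component, and the row sums of T equal 1 up to floating-point error.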

4. Color Transformations According to Recommendation BT.709

Nowadays, the most common color transformation is probably that defined by ITU Recommendation BT.709 [17].
The R’, G’, B’ and Y, CB, CR component samples are encoded as unsigned integers with either 8 or 10 bits. The results concerning noise generated by color transformations are similar for 8-bit and 10-bit representations, though the level of noise is reduced by about 12 dB for 10-bit representations [28]. For the sake of simplicity, in this section, further considerations will focus on 8-bit sample representations, though the results can easily be adapted to 10-bit sample encoding. Similar reasoning for 10-bit representation is included in Section 5.
In 8-bit representations, the values CB = CR = 128 correspond to gray levels characterized by Y. The maximum values of CB or CR correspond to blue or red, respectively. The minimum values of CB and CR correspond to yellow and cyan, respectively. There are two versions of the Y, CB, and CR sample encoding range in use [7,17]:
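  • “Full-Swing”: The R’, G’, B’, and Y, CB, CR samples are represented in the interval ‹0,255›. According to the notation in [7], we will use $\mathbf{T}_{255}$ and $\mathbf{S}_{255}$ for the color transformation matrices:
$$\begin{bmatrix} Y \\ C_B - 128 \\ C_R - 128 \end{bmatrix} = \mathbf{T}_{255} \begin{bmatrix} R' \\ G' \\ B' \end{bmatrix}, \qquad \begin{bmatrix} R' \\ G' \\ B' \end{bmatrix} = \mathbf{S}_{255} \begin{bmatrix} Y \\ C_B - 128 \\ C_R - 128 \end{bmatrix}.$$
  • “Television Range”: The 8-bit sample representations of R’, G’, B’ are in the interval ‹0,255›. Luma Y has representations in the interval ‹16,235›, whereas chroma values are in the interval ‹16,240›. According to the notation in [7], we will use $\mathbf{T}_{219}$ and $\mathbf{S}_{219}$ for the color transformation matrices:
$$\begin{bmatrix} Y - 16 \\ C_B - 128 \\ C_R - 128 \end{bmatrix} = \mathbf{T}_{219} \begin{bmatrix} R' \\ G' \\ B' \end{bmatrix}, \qquad \begin{bmatrix} R' \\ G' \\ B' \end{bmatrix} = \mathbf{S}_{219} \begin{bmatrix} Y - 16 \\ C_B - 128 \\ C_R - 128 \end{bmatrix}.$$
The transformation matrices are the following:
$$\mathbf{T}_{255} = \begin{bmatrix} 0.2126 & 0.7152 & 0.0722 \\ -0.1146 & -0.3854 & 0.5000 \\ 0.5000 & -0.4542 & -0.0458 \end{bmatrix}, \qquad \mathbf{S}_{255} = \begin{bmatrix} 1 & 0 & 1.5748 \\ 1 & -0.187324 & -0.468124 \\ 1 & 1.8556 & 0 \end{bmatrix},$$
$$\mathbf{T}_{219} = \frac{1}{256} \begin{bmatrix} 46.742 & 157.243 & 15.874 \\ -25.765 & -86.674 & 112.439 \\ 112.439 & -102.129 & -10.310 \end{bmatrix}, \qquad \mathbf{S}_{219} = \frac{1}{256} \begin{bmatrix} 298.082 & 0 & 458.942 \\ 298.082 & -54.592 & -136.425 \\ 298.082 & 540.775 & 0 \end{bmatrix}.$$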
The properties of the R’G’B’ → YCBCR → R’G’B’ color transformations are listed in Table 1. These are theoretical properties, as they do not consider the effects of clipping at the edges of the intervals for sample values.
The properties listed in Table 1 for BT.709 are very similar to those of BT.601 [35,37,40].
Similar reasoning for the BT.709 YCBCR → R’G’B’ → YCBCR transformations, with all R’, G’, B’, and Y, CB, CR samples in the ‹0,255› interval (“full-swing”), leads to the conclusion that the transformation is borderline reversible.
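The behavior of a full R’G’B’ → YCBCR → R’G’B’ cycle can be explored with a small simulation. The sketch below is an illustration only, not the software used in the paper: it assumes the full-swing BT.709 matrices given above, rounding by floor(x + 0.5), clipping to ‹0,255›, and a coarse grid of R’G’B’ triples instead of real images. All function names are hypothetical.

```python
import math

T_255 = [[0.2126, 0.7152, 0.0722],
         [-0.1146, -0.3854, 0.5000],
         [0.5000, -0.4542, -0.0458]]
S_255 = [[1.0, 0.0, 1.5748],
         [1.0, -0.187324, -0.468124],
         [1.0, 1.8556, 0.0]]

def quantize(x):
    """round(x) = floor(x + 0.5), then clip to the 8-bit interval."""
    return min(255, max(0, math.floor(x + 0.5)))

def matvec(matrix, vector):
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]

def rgb_to_ycbcr(rgb):
    y, cb, cr = matvec(T_255, rgb)
    return [quantize(y), quantize(cb + 128), quantize(cr + 128)]

def ycbcr_to_rgb(ycc):
    y, cb, cr = ycc
    return [quantize(v) for v in matvec(S_255, [y, cb - 128, cr - 128])]

def cycle(rgb):
    """One full R'G'B' -> YCbCr -> R'G'B' cycle with integer intermediates."""
    return ycbcr_to_rgb(rgb_to_ycbcr(rgb))

max_err = 0
changed_in_second = 0
total = 0
levels = range(0, 256, 5)  # coarse grid that includes 0 and 255
for r in levels:
    for g in levels:
        for b in levels:
            first = cycle([r, g, b])
            second = cycle(first)
            err = max(abs(p - q) for p, q in zip((r, g, b), first))
            max_err = max(max_err, err)
            changed_in_second += first != second
            total += 1

print("max |error| after one cycle:", max_err)
print("share of triples changed by the second cycle:", changed_in_second / total)
```

On this grid the largest error after one cycle is a single quantization step. The printed share for the second cycle is specific to the synthetic grid and should not be read as one of the paper's experimental results.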

5. Color Transformations According to Recommendation BT.2020

ITU Recommendation BT.2020 [18] defines the R’G’B’ and YCBCR color spaces according to the needs of modern Ultra High Definition Television (UHDTV), which provides a high dynamic range of luma and a wide color gamut [41]. Such systems require an extended range of sample values. ITU Recommendation BT.2020 provides definitions of primaries and color transformation formulas. This standard is intended for 10- and 12-bit representations, in which sample values are encoded as unsigned integers. For the sake of simplicity, this text focuses on 10-bit sample representations, though the results can easily be adapted to 12-bit sample encoding.
As for 8-bit representations, two versions of the Y, CB, and CR sample encoding ranges are in use [7,17]:
  • “Full-Swing”: The R’, G’, B’, and Y, CB, CR samples are represented in the interval ‹0,1023›.
  • “Television Range”: The 10-bit sample representations of R’, G’, B’ are in the interval ‹0,1023›. Luma Y and chroma CB, CR signals have representations in the interval of ‹64, 960›.
For both sample encoding ranges and 10-bit sample representations, the values CB = CR = 512 correspond to gray levels characterized by Y values.
The respective color transformations use formulas similar to those in Section 4. For 10-bit representations, for instance, the transformation matrix $\mathbf{S}_{1023}$ that corresponds to $\mathbf{S}_{255}$ from Section 4 is
$$\mathbf{S}_{1023} = \begin{bmatrix} 1 & 0 & 1.4746 \\ 1 & -0.16455 & -0.5713 \\ 1 & 1.8814 & 0 \end{bmatrix}.$$
Assuming theoretically that no clipping occurs at the edges of the sample value intervals, Equation (7) yields the bounds for sample errors LR = 1, LG = 1, LB = 1 when the full range of sample values, ‹0,1023›, is used, similar to the BT.709 transformation.
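The entries of the inverse matrices can be cross-checked from the luma coefficients of each Recommendation. The sketch below assumes the commonly quoted coefficients K_R = 0.2126, K_B = 0.0722 for BT.709 and K_R = 0.2627, K_B = 0.0593 for BT.2020 (non-constant-luminance form). These values and the helper names do not appear in the text above and are assumptions of this example.

```python
import math

def inverse_matrix(kr, kb):
    """Build S (YCbCr -> R'G'B') from the luma coefficients K_R and K_B."""
    kg = 1.0 - kr - kb
    cr_gain = 2.0 * (1.0 - kr)   # R' = Y + cr_gain * C_R
    cb_gain = 2.0 * (1.0 - kb)   # B' = Y + cb_gain * C_B
    return [
        [1.0, 0.0, cr_gain],
        [1.0, -kb * cb_gain / kg, -kr * cr_gain / kg],
        [1.0, cb_gain, 0.0],
    ]

def error_bounds(s):
    """L = round(sum of |row entries| / 2), with round(x) = floor(x + 0.5)."""
    return [math.floor(sum(abs(v) for v in row) / 2.0 + 0.5) for row in s]

S_709 = inverse_matrix(0.2126, 0.0722)
S_2020 = inverse_matrix(0.2627, 0.0593)

for name, s in (("BT.709", S_709), ("BT.2020", S_2020)):
    print(name)
    for row in s:
        print("  ", [round(v, 6) for v in row])
    print("   bounds L_R, L_G, L_B:", error_bounds(s))
```

With these coefficients the BT.2020 bounds evaluate to one quantization step per component, in line with the statement above.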

6. Experimental Estimation of Video Quality Deterioration Due to Color Transformations

The goal of these experiments is to understand color transformation errors in practical environments, such as when clipping occurs at the edges of the permitted intervals.
In these experiments, color transformations are implemented using floating-point arithmetic, and the results are rounded to 8-bit or 10-bit integers using the rule as follows:
$$\mathrm{round}(x) = \lfloor x + 0.5 \rfloor,$$
where $\lfloor y \rfloor$ denotes the greatest integer not exceeding $y$. Because color transformations use floating-point arithmetic, the selection of the definition of rounding has negligible influence on the results. For multiplications of color components by a selected factor, the way of processing is the same. Video codecs already produce integer sample values with the requested number of bits in sample representations. Therefore, all input, output and intermediate data between color transformations, the encoder and decoder are integers.
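The rounding rule above differs from the default rounding of some languages only in exact tie cases. A minimal Python sketch follows; the helper names are hypothetical, and the clipping step is included only because the experiments operate on bounded 8-bit or 10-bit intervals.

```python
import math

def round_half_up(x: float) -> int:
    """round(x) = floor(x + 0.5), as defined in the text."""
    return math.floor(x + 0.5)

def to_sample(x: float, bits: int = 8) -> int:
    """Round, then clip to the unsigned interval [0, 2**bits - 1]."""
    return min(2 ** bits - 1, max(0, round_half_up(x)))

# Compare with Python's built-in round(), which rounds ties to even.
for value in (2.5, 3.5, -0.5, 255.5):
    print(value, round_half_up(value), round(value), to_sample(value))
```

Ties such as 2.5 are the only inputs on which the two rules disagree, which is consistent with the remark that the choice of rounding definition has negligible influence when floating-point intermediate values are used.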
To address the topic cross-sectionally, the experiments were conducted in two scenarios:
  • Scenario A: Color transformations according to ITU Recommendation BT.709 with 8-bit sample encoding. Experiments were conducted for still images and AVC compression when appropriate.
  • Scenario B: Color transformations according to ITU Recommendation BT.2020 with 10-bit sample encoding. Experiments were conducted for 64-frame video clips and HEVC compression when appropriate.

6.1. Scenario A of the Experiments

The typical approach to estimating color properties in video involves analyzing key frames [4,5,6,7,42,43]. To obtain representative results, images with diverse content and color characteristics should be used. One recognized image database is the Kodak Lossless True Color Image Suite, which provides images with 8-bit sample encoding in the R’G’B’ color space [44]. Thus, the experiments were conducted using colorful still images I01-I25 obtained from the TID2013 database [45,46]. The TID2013 database provides access to the Kodak Suite.
For this type of research, image size does not affect the results, provided the image is large enough. Thus, we accept moderate image sizes of 512 × 384 or 384 × 512, as with the aforementioned image database. Each image contains 196,608 samples of each component, which is a sufficient number to represent color transformation statistics. The rationale for using 4:4:4 data is explained at the beginning of Section 3.
The experiments were performed for the BT.709 color transformation [7,17]. Previously [36,37,40], the BT.601 transformations were investigated, and the results were very similar. Therefore, there is no need to explain those results here.
The AVC codec [47,48] was used in its FFmpeg implementation [49] and configured for Intra mode with 4:4:4 data (High 4:4:4 Profile [47]), i.e., no chroma subsampling was used. All input and output color component samples have 8-bit representations. The Constant Rate Factor is set to 25 in the FFmpeg codec. The FFmpeg configuration was as follows:
  • ffmpeg.exe -y -pix_fmt yuv444p -s 512x384 -i input_file.yuv -c:v libx264 -profile:v high444 -crf 25 out.
One might also wonder why only Intra mode was configured for the video codec. This is due to the static nature of color transformations. Temporal relations are unimportant for the provided analysis. Thus, the experiments are simplified to still images and Intra mode. The simplicity of the experiments ensures high reproducibility of the results without compromising the strength of validation.
Since color transformations cause moderate and consistent quality degradation across various experiments, the peak signal-to-noise ratio (PSNR) can be used to evaluate image quality, as discussed in Section 1. Additionally, image and video processing and compression software primarily report quality based on PSNR. Therefore, this paper uses PSNR values to evaluate image quality.
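For reference, PSNR can be computed per color component as sketched below. The paper does not spell out its formula, so this example assumes the common definition with the peak value 2^bits − 1 and the mean squared error taken over all samples of one component; the function name and the sample values are hypothetical.

```python
import math

def psnr(reference, test, bits=8):
    """PSNR in dB between two equally long sequences of integer samples."""
    if len(reference) != len(test):
        raise ValueError("sequences must have the same length")
    peak = 2 ** bits - 1
    mse = sum((a - b) ** 2 for a, b in zip(reference, test)) / len(reference)
    if mse == 0:
        return math.inf   # identical signals
    return 10.0 * math.log10(peak ** 2 / mse)

# Toy data: two of the four samples differ by one quantization step.
ref = [16, 128, 235, 64]
out = [17, 128, 234, 64]
print(f"{psnr(ref, out):.2f} dB")
```

In the experiments, the same computation would be applied separately to each component, with the original input picture as the reference.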
Most of the numerical results presented for Scenario A in this section and in Appendix A, Appendix B, and Appendix C come from the JVET document [50]. The authors hold the copyright for this document.

6.2. Scenario B of the Experiments

In this scenario, the experiments are repeated but using test video sequences with 10-bit sample encoding and HEVC compression [51,52].
For the experiments, 8 test video clips with 10-bit representation of the R’, G’, and B’ components were selected from the set of test video sequences proposed by the Joint Collaborative Team on Video Coding (JCT-VC) for research on HEVC with range extensions [53,54]. For each video clip, the first 64 frames are selected from the original sequences described in [53] and Table 2.
The HEVC codec [51,52] was used in its FFmpeg implementation [49] in the Main 4:4:4 10 profile and Main tier. All input and output color component samples have 10-bit representations. The Constant Rate Factor is set to 25 in the FFmpeg codec. The FFmpeg configuration was as follows:
  • ffmpeg.exe -y -pix_fmt yuv444p10le -s <width>x<height> -i input_file.yuv -c:v libx265 -crf 25.
The Group of Pictures (GOP) structure was as follows: IBBBBPBBBBPBBBBP…PBBP. A single GOP spanned all 64 frames because this scenario mainly tests predictive coding, whereas intra-frame coding was tested in Scenario A.

6.3. One or Two Cycles of Color Transformation

Figure 1 depicts the scenario involving two R’G’B’ → YCBCR → R’G’B’ cycles. After each R’G’B’ → YCBCR or YCBCR → R’G’B’ transformation, the sample values are rounded to 8-bit integers in Scenario A, and to 10-bit integers for Scenario B.
The experiments in this section correspond to a situation in which a software module performs a color transformation without modifying a video frame or a part of a still image.
The PSNR values are calculated using the original input images as the references. Table 3 provides the average PSNR values after one or two cycles, and Table A1, Table A2, Table A3 and Table A4 in Appendix A show the detailed results for individual test images or test video sequences.
The average PSNR values (Table 3) are calculated for each component using either 4,915,200 or 1,191,116,800 samples, depending on the scenario. The large numbers of samples assure reliable estimation of averages. The PSNR values for the green component are 2–5 dB higher than those for the red or blue components.
The PSNR values after the first cycle are lower than the anticipated values for 8-bit quantization. For the second quantization cycle, however, the loss of PSNR is negligible, not exceeding 0.02 dB. This finding aligns with the transformation property stated in Section 4, Equation (7). As expected, the PSNR values for 10-bit representations are approximately 12 dB higher than the corresponding values for 8-bit representations. For the “television range,” however, this difference is slightly lower. For this sample coding interval, the PSNR values are approximately 0.8–2.0 dB lower than for full-range sample encoding for both scenarios.
For R′, G′, B′ components, for the entire sets of test images or video clips, the total histograms of errors are provided in Table 4. The histograms for individual test images and selected test video sequences are in Appendix A: Table A5 and Table A6.
The histograms align with the R’, G’, and B’ error limits in Table 1. Note that the theoretical error limits from Section 4 and Section 5 were obtained under the assumption that no clipping occurs for sample values. In real video, clipping may occur, but sample values rarely reach extreme values. In particular, chroma samples mostly describe pixels with low saturation. These features reduce the probability of sample value clipping. Thus, the theoretical limits from Table 1 often apply.

6.4. Two Transformation Cycles with Small Modifications of Video Data in Transmission Representation

In this section, we consider Scenario A only, i.e., 8-bit samples and color transformations according to ITU Recommendation BT.709. These experiments involve no compression.
The scenario is depicted in Figure 2, where two cycles R’G’B’ → YCBCR → R’G’B’ are shown. Unlike the previous scenario from Figure 1, the YCBCR values are modified and rounded in each cycle. The two modifications are mutually inverse arithmetic operations. After each R’G’B’ → YCBCR or YCBCR → R’G’B’ transformation, the sample values are rounded to 8-bit integers.
This experiment demonstrates the changes in image quality due to modifications of intermediate sample values.
The detailed results for individual test images are shown in Table A7 and Table A8 in Appendix B.
The values of PSNR are lower by several decibels compared to the results from Section 6.3 because the sample values are modified by a factor of 0.9. In the second cycle, multiplying by the factor 1/0.9 does not result in higher PSNR values. In the second cycle, the PSNR values are lower than in the first cycle by less than 0.05 dB.
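Why the inverse factor cannot restore the lost quality can be seen in a one-dimensional sketch. It is a simplification and not the pipeline of Figure 2: a single 8-bit value is scaled by 0.9, rounded, scaled by 1/0.9, and rounded again, with exact rational arithmetic used to avoid floating-point ties. All helper names are hypothetical.

```python
import math
from fractions import Fraction

SCALE = Fraction(9, 10)   # the factor 0.9 mentioned in the text

def round_half_up(x):
    """round(x) = floor(x + 0.5), evaluated exactly on fractions."""
    return math.floor(x + Fraction(1, 2))

def clip8(v):
    return min(255, max(0, v))

def scale_down(x):
    return clip8(round_half_up(SCALE * x))

def scale_up(y):
    return clip8(round_half_up(y / SCALE))

levels = range(256)
distinct = len({scale_down(x) for x in levels})
recovered = sum(scale_up(scale_down(x)) == x for x in levels)

print("distinct values after scaling by 0.9:", distinct)
print("input values recovered after scaling by 1/0.9:", recovered)
```

Scaling by 0.9 maps the 256 input levels onto 231 distinct integers, so 25 levels collide with a neighbor and cannot be recovered by the inverse factor. This illustrates why the first modification is irreversible once its result is rounded.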

6.5. Two Transformation Cycles with AVC or HEVC Compression of Luma and Chromas

This experiment demonstrates the loss of compression performance due to additional color transformation.
The scenario considered is depicted in Figure 3, where a single cycle R’G’B’ → YCBCR → R’G’B’ is shown. Unlike the previous scenario from Figure 1, the YCBCR values are modified by encoding and decoding. After the R’G’B’ → YCBCR or YCBCR → R’G’B’ transformations, the sample values are rounded to 8-bit integers in Scenario A or to 10-bit integers in Scenario B.
For each image and for each color component, two different PSNR values are calculated, as depicted in Figure 3:
(1) PSNR of the decoder output with the decoder input as the reference (PSNR Y,CB,CR),
(2) PSNR of the output R’G’B’ image with the input R’G’B’ image as the reference (PSNR R,G,B).
The two groups of PSNR values represent compression efficiency for two color spaces.
This corresponds to a popular scenario in contemporary research. Often, a piece of compression software with input and output in YCBCR is embedded in artificial intelligence software or used to compress R’G’B’ test images. This example illustrates how differently compression efficiency is reported in various color spaces.
The average values of PSNR for R’G’B’ and YCBCR are provided in Table 5 whereas the detailed results for individual test images are shown in Table A9, Table A10, Table A11 and Table A12 in Appendix C.
Depending on the number of bits used to encode samples, the average PSNR for YCBCR is about 4–8 dB higher than the average PSNR for R’G’B’. This significant difference may be partially explained by the higher PSNR values typically observed for chroma. Nevertheless, this phenomenon must be carefully considered when reporting compression results in different color spaces.
The results for luma Y and green G’ are the most similar. This is because the green component contributes the most to the calculation of Y. Nevertheless, even for luma and green, the differences are about 1–3 dB. This is significant compared to the results reported for individual contributions to the prospective new compression techniques.
These results are also useful for other video codecs, such as VVC [55], because the main difference relates to other rate-distortion characteristics of these codecs.
A clear warning follows: for a given video codec, the reported rate-distortion performance varies significantly depending on the color space used for reporting.
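One simplified way to see why PSNR values differ between color spaces is to propagate coding errors through the inverse transformation. The sketch below rests on an assumption that is not made in the paper: independent, zero-mean errors of equal variance in Y, CB, and CR, with no rounding or clipping. Under that assumption, the error variance of each R’G’B’ component is the YCBCR error variance multiplied by the sum of squared entries of the corresponding row of the full-range BT.709 inverse matrix. The names are hypothetical.

```python
import math

# Full-range BT.709 inverse matrix (rows: R', G', B'; columns: Y, C_B, C_R).
S_255 = [[1.0, 0.0, 1.5748],
         [1.0, -0.187324, -0.468124],
         [1.0, 1.8556, 0.0]]

def psnr_loss_db(row):
    """PSNR drop (dB) caused by the variance gain of one output component."""
    variance_gain = sum(c * c for c in row)
    return 10.0 * math.log10(variance_gain)

loss = {name: psnr_loss_db(row) for name, row in zip(("R", "G", "B"), S_255)}
for name, value in loss.items():
    print(f"{name}': about {value:.2f} dB lower than the YCbCr PSNR")
```

Real codecs do not produce equal, independent errors in the three components (chroma typically has a higher PSNR, as noted above), so these figures are illustrative only and are not a prediction of the measured results.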

7. Conclusions

Final conclusions:
  • Color transformations result in relatively minor degradation, comparable to the rounding noise measured for the respective number of bits of sample representations. Color transformations in the full R’G’B’ → YCBCR → R’G’B’ cycle generate noise with a PSNR value exceeding 50 dB for 8-bit representations (see Appendix A). For longer bit representations, the PSNR value increases by approximately 6 dB for each additional bit.
  • The quality losses may be comparable to the gains obtained using new compression tools as observed for the work in standardization bodies.
  • Additional color transformations require at least one “guard” bit if there is no reduction in the sample interval. Shrinking the sample value interval to the “television range” requires two additional “guard” bits.
  • Consecutive cycles of the color transformation R’G’B’ → YCBCR → R’G’B’ result in negligible additional distortion of approximately 0.01 dB (see Appendix A). Deeper analysis [36,37,40] demonstrates that only a small percentage of samples (mostly below 5%) change their numerical values in consecutive cycles.
  • Operations performed on color component samples within a color transformation cycle may cause significant distortion (see Appendix B).
  • The PSNR RGB values are substantially lower than the PSNR YCBCR values measured for compression alone (see Appendix C). Numerous experiments in MPEG and JVET have shown that the PSNR values for CB and CR are significantly higher (by several dB) than the PSNR for Y or the PSNR RGB. Even a comparison of PSNR Y and PSNR G often reveals a loss of nearly 2 dB.
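The first conclusions in the list above can be illustrated with a small simulation. The Python sketch below is not the software used in this paper. It assumes a full-range mapping with the chroma offset equal to half of the sample range, round-half-up quantization with clipping, uniformly random samples instead of test images, and the BT.709 luma coefficients for both bit depths. All function names are hypothetical.

```python
import math
import random

KR, KB = 0.2126, 0.0722           # BT.709 luma coefficients
KG = 1.0 - KR - KB


def quantize(x, peak):
    """Round half up to an integer and clip to [0, peak]."""
    return min(peak, max(0, math.floor(x + 0.5)))


def rgb_to_ycbcr(r, g, b, bits):
    """Forward transform, full range (assumed mapping), integer output."""
    peak, mid = (1 << bits) - 1, 1 << (bits - 1)
    y = KR * r + KG * g + KB * b
    cb = (b - y) / (2.0 * (1.0 - KB)) + mid
    cr = (r - y) / (2.0 * (1.0 - KR)) + mid
    return quantize(y, peak), quantize(cb, peak), quantize(cr, peak)


def ycbcr_to_rgb(y, cb, cr, bits):
    """Inverse transform, full range (assumed mapping), integer output."""
    peak, mid = (1 << bits) - 1, 1 << (bits - 1)
    r = y + 2.0 * (1.0 - KR) * (cr - mid)
    b = y + 2.0 * (1.0 - KB) * (cb - mid)
    g = (y - KR * r - KB * b) / KG
    return quantize(r, peak), quantize(g, peak), quantize(b, peak)


def cycle(pixels, bits):
    """One full R'G'B' -> YCBCR -> R'G'B' cycle for a list of pixels."""
    return [ycbcr_to_rgb(*rgb_to_ycbcr(*p, bits), bits) for p in pixels]


def psnr(reference, test, channel, bits):
    """PSNR in dB for one color component."""
    peak = (1 << bits) - 1
    mse = sum((a[channel] - b[channel]) ** 2
              for a, b in zip(reference, test)) / len(reference)
    return float("inf") if mse == 0 else 10.0 * math.log10(peak * peak / mse)


def experiment(bits, count=20000, seed=0):
    """Run two consecutive cycles on random samples and collect statistics."""
    rng = random.Random(seed)
    peak = (1 << bits) - 1
    original = [tuple(rng.randint(0, peak) for _ in range(3))
                for _ in range(count)]
    first = cycle(original, bits)
    second = cycle(first, bits)
    return {
        "first": [psnr(original, first, c, bits) for c in range(3)],
        "second": [psnr(original, second, c, bits) for c in range(3)],
        "max_error": max(abs(a[c] - b[c])
                         for a, b in zip(original, first) for c in range(3)),
        "changed": sum(a != b for a, b in zip(first, second)) / count,
    }


RESULT_8 = experiment(8)
RESULT_10 = experiment(10)

for bits, result in ((8, RESULT_8), (10, RESULT_10)):
    first = ", ".join(f"{v:.2f}" for v in result["first"])
    second = ", ".join(f"{v:.2f}" for v in result["second"])
    print(f"{bits}-bit, first cycle  PSNR R/G/B [dB]: {first}")
    print(f"{bits}-bit, second cycle PSNR R/G/B [dB]: {second}")
    print(f"{bits}-bit, pixels changed by the second cycle: "
          f"{100.0 * result['changed']:.3f}%")
```

With this mapping, the rounding errors of Y, CB and CR are each at most half a quantization step, so the error of a reconstructed R’, G’ or B’ sample cannot exceed one step. Running the sketch for 8 and 10 bits also shows the gain of roughly 6 dB per additional bit, because the mean squared error expressed in quantization steps stays nearly constant while the peak value grows.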
Research on new compression methods requires careful consideration of how color transformations influence the reported compression efficiency. This is critical today, as artificial intelligence and compression software are integrated into a single system. The combination of methods and color representations used in such a system must be taken into account to avoid inaccurate compression efficiency results.
The paper delivers a message to video researchers: “Avoid unnecessary color transformations!”

Author Contributions

Conceptualization, M.D.; methodology, M.D., A.G. and O.S.; software, A.G. and O.S.; validation, A.G.; writing, M.D. and A.G. All authors have read and agreed to the published version of the manuscript.

Funding

The research was supported by the Ministry of Science and Higher Education of the Republic of Poland under SBAD action.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The experiments described in Section 6 were conducted using the AVC and HEVC codecs in their implementations known as libx264 and libx265 from https://ffmpeg.org (accessed on 15 September 2025). The test images are from Tampere Image Database 2013 TID2013 [50,51], Version 1, available at https://www.ponomarenko.info/tid2013.htm (accessed on 15 September 2025). The test video clips can also be obtained through JVET on ftp://jvet@ftp.ient.rwth-aachen.de and ftp://jvet@ftp.hhi.fraunhofer.de.

Acknowledgments

The authors thank the Ministry of Science and Higher Education of the Republic of Poland for its support.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
| Abbreviation | Meaning |
|---|---|
| AI | artificial intelligence |
| AVC | Advanced Video Coding |
| HD | High Definition |
| HEVC | High Efficiency Video Coding |
| IEC | International Electrotechnical Commission |
| IEEE | Institute of Electrical and Electronics Engineers |
| IS | International Standard |
| ISO | International Organization for Standardization |
| ITU | International Telecommunication Union |
| JCT-VC | Joint Collaborative Team on Video Coding |
| JPEG | Joint Photographic Experts Group (an international group that develops standards for still image compression) |
| JPEG | popular name for the image compression standard ISO/IEC IS 10918-1, also ITU-T Rec. T.81 [56] |
| JPEG 2000 | popular name for the image compression standard ISO/IEC IS 15444-1, also ITU-T Rec. T.800 [57] |
| JVET | Joint Video Experts Team (an international expert group that develops standards for video compression, a joint group of ITU-T SG21 WP3/21 and ISO/IEC JTC 1/SC 29, working for ISO, IEC, ITU) |
| MOS | Mean Opinion Score |
| MPEG | Moving Picture Experts Group (international expert groups that develop standards for digital audio, video, and graphics compression, working for ISO and IEC) |
| MPAI | Moving Picture, Audio and Data Coding by Artificial Intelligence (an international expert group that develops standards for video and audio compression using the AI methods, produces the IEEE standards) |
| Rec. | Recommendation |
| RGB | red, green, blue |
| SD | Standard Definition |
| SSIM | Structural Similarity Index Measure |
| UHD | Ultra High Definition |
| VCEG | Video Coding Experts Group |
| VMAF | Video Multimethod Assessment Fusion |
| VQM | Video Quality Metric |
| VVC | Versatile Video Coding |
| YCBCR | luma, chroma CB, chroma CR |

Appendix A

Table A1. PSNR [dB] after one and two cycles of color transformations R’G’B’ → YCBCR → R’G’B’ according to ITU Rec. BT.709 [7,17] for the full range of luma and chroma in Scenario A (8-bit sample representations).

| Image | First cycle PSNR R | First cycle PSNR G | First cycle PSNR B | Second cycle PSNR R | Second cycle PSNR G | Second cycle PSNR B |
|---|---|---|---|---|---|---|
| I01 | 52.68 | 56.62 | 51.21 | 52.68 | 56.62 | 51.21 |
| I02 | 52.10 | 58.24 | 51.40 | 52.10 | 58.24 | 51.40 |
| I03 | 52.00 | 56.82 | 51.10 | 52.00 | 56.82 | 51.09 |
| I04 | 52.32 | 57.48 | 51.64 | 52.32 | 57.48 | 51.64 |
| I05 | 52.28 | 57.43 | 51.98 | 52.28 | 57.43 | 51.98 |
| I06 | 52.55 | 57.28 | 51.49 | 52.55 | 57.28 | 51.49 |
| I07 | 52.15 | 56.58 | 51.81 | 52.15 | 56.58 | 51.81 |
| I08 | 52.43 | 57.06 | 51.68 | 52.43 | 57.06 | 51.68 |
| I09 | 51.54 | 58.47 | 51.29 | 51.54 | 58.47 | 51.29 |
| I10 | 52.33 | 57.31 | 51.75 | 52.33 | 57.31 | 51.75 |
| I11 | 51.55 | 57.59 | 51.10 | 51.55 | 57.59 | 51.10 |
| I12 | 51.85 | 57.15 | 51.63 | 51.85 | 57.15 | 51.63 |
| I13 | 52.41 | 57.04 | 51.67 | 52.41 | 57.04 | 51.67 |
| I14 | 52.43 | 56.53 | 52.09 | 52.43 | 56.53 | 52.09 |
| I15 | 52.53 | 57.08 | 50.95 | 52.53 | 57.08 | 50.94 |
| I16 | 51.87 | 56.74 | 50.65 | 51.87 | 56.74 | 50.65 |
| I17 | 53.15 | 56.73 | 52.27 | 53.15 | 56.73 | 52.27 |
| I18 | 52.36 | 57.00 | 51.83 | 52.36 | 57.00 | 51.83 |
| I19 | 52.39 | 57.08 | 51.34 | 52.39 | 57.08 | 51.34 |
| I20 | 54.22 | 58.03 | 51.63 | 54.21 | 58.02 | 51.63 |
| I21 | 52.44 | 58.08 | 51.30 | 52.44 | 58.08 | 51.30 |
| I22 | 52.50 | 56.97 | 51.86 | 52.50 | 56.95 | 51.86 |
| I23 | 52.49 | 57.42 | 51.34 | 52.49 | 57.42 | 51.33 |
| I24 | 52.44 | 56.97 | 51.73 | 52.44 | 56.97 | 51.73 |
| I25 | 52.68 | 58.90 | 52.16 | 52.68 | 58.90 | 52.16 |
| Average | 52.39 | 57.30 | 51.56 | 52.39 | 57.30 | 51.56 |
Table A2. PSNR [dB] after one and two cycles of color transformations R’G’B’ → YCBCR → R’G’B’ according to ITU Rec. BT.709 for the “television range” of luma and chroma in Scenario A (8-bit sample representations).

| Image | First cycle PSNR R | First cycle PSNR G | First cycle PSNR B | Second cycle PSNR R | Second cycle PSNR G | Second cycle PSNR B |
|---|---|---|---|---|---|---|
| I01 | 51.32 | 55.31 | 50.74 | 51.32 | 55.31 | 50.74 |
| I02 | 51.72 | 55.17 | 51.13 | 51.72 | 55.17 | 51.13 |
| I03 | 51.15 | 55.95 | 50.70 | 51.15 | 55.95 | 50.69 |
| I04 | 51.48 | 55.22 | 50.84 | 51.48 | 55.22 | 50.84 |
| I05 | 51.24 | 55.20 | 50.95 | 51.24 | 55.19 | 50.95 |
| I06 | 51.88 | 55.38 | 50.66 | 51.88 | 55.38 | 50.66 |
| I07 | 51.64 | 55.16 | 50.65 | 51.64 | 55.16 | 50.65 |
| I08 | 51.74 | 55.22 | 50.78 | 51.73 | 55.22 | 50.77 |
| I09 | 51.42 | 54.80 | 50.90 | 51.42 | 54.80 | 50.90 |
| I10 | 51.90 | 55.21 | 51.10 | 51.89 | 55.21 | 51.10 |
| I11 | 51.15 | 54.91 | 51.44 | 51.15 | 54.91 | 51.44 |
| I12 | 51.57 | 55.24 | 50.49 | 51.57 | 55.24 | 50.49 |
| I13 | 51.51 | 55.28 | 50.93 | 51.51 | 55.28 | 50.93 |
| I14 | 51.70 | 55.27 | 50.82 | 51.70 | 55.27 | 50.82 |
| I15 | 52.02 | 55.27 | 50.32 | 52.02 | 55.27 | 50.32 |
| I16 | 51.47 | 54.92 | 50.49 | 51.47 | 54.92 | 50.49 |
| I17 | 51.56 | 55.29 | 51.55 | 51.56 | 55.29 | 51.55 |
| I18 | 51.75 | 55.13 | 50.96 | 51.75 | 55.13 | 50.96 |
| I19 | 51.82 | 55.23 | 50.86 | 51.82 | 55.22 | 50.86 |
| I20 | 52.05 | 56.87 | 50.86 | 52.04 | 56.86 | 50.86 |
| I21 | 51.69 | 55.12 | 50.53 | 51.69 | 55.12 | 50.53 |
| I22 | 51.72 | 55.14 | 50.66 | 51.71 | 55.14 | 50.66 |
| I23 | 51.62 | 55.32 | 50.70 | 51.61 | 55.32 | 50.69 |
| I24 | 51.45 | 55.36 | 50.75 | 51.45 | 55.36 | 50.75 |
| I25 | 52.15 | 55.91 | 51.94 | 52.15 | 55.91 | 51.50 |
| Average | 51.63 | 55.31 | 50.87 | 51.63 | 55.31 | 50.85 |
Table A3. PSNR [dB] after one and two cycles of color transformations R’G’B’ → YCBCR → R’G’B’ according to ITU Rec. BT.2020 [18] for the full range of luma and chroma in Scenario B (10-bit sample representations).

| Video Clip | First cycle PSNR R | First cycle PSNR G | First cycle PSNR B | Second cycle PSNR R | Second cycle PSNR G | Second cycle PSNR B |
|---|---|---|---|---|---|---|
| Traffic | 64.46 | 69.18 | 63.41 | 64.46 | 69.18 | 63.41 |
| ParkScene | 64.53 | 68.57 | 63.47 | 64.53 | 68.57 | 63.47 |
| OldTownCross | 64.53 | 68.51 | 63.48 | 64.53 | 68.51 | 63.48 |
| Kimono1 | 64.51 | 68.65 | 63.43 | 64.51 | 68.65 | 63.43 |
| EBURainFruits | 64.41 | 68.92 | 63.45 | 64.41 | 68.91 | 63.45 |
| EBULupoCandlelight | 64.42 | 68.86 | 63.49 | 64.42 | 68.86 | 63.49 |
| DucksAndLegs | 64.53 | 68.58 | 63.47 | 64.53 | 68.58 | 63.47 |
| BirdsInCage | 64.42 | 68.93 | 63.46 | 64.42 | 68.93 | 63.46 |
| Average | 64.48 | 68.77 | 63.46 | 64.48 | 68.77 | 63.46 |
Table A4. PSNR [dB] after one and two cycles of color transformations R’G’B’ → YCBCR → R’G’B’ according to ITU Rec. BT.2020 [18] for the “television range” of luma and chroma in Scenario B (10-bit sample representations).

| Video Clip | First cycle PSNR R | First cycle PSNR G | First cycle PSNR B | Second cycle PSNR R | Second cycle PSNR G | Second cycle PSNR B |
|---|---|---|---|---|---|---|
| Traffic | 63.86 | 66.70 | 62.71 | 63.86 | 66.70 | 62.71 |
| ParkScene | 63.80 | 66.65 | 62.67 | 63.80 | 66.65 | 62.67 |
| OldTownCross | 63.80 | 66.68 | 62.69 | 63.80 | 66.68 | 62.69 |
| Kimono1 | 63.82 | 66.69 | 62.69 | 63.82 | 66.69 | 62.69 |
| EBURainFruits | 61.47 | 62.62 | 60.69 | 61.43 | 62.60 | 60.65 |
| EBULupoCandlelight | 57.97 | 58.53 | 57.64 | 57.96 | 58.53 | 57.64 |
| DucksAndLegs | 63.82 | 66.67 | 62.69 | 63.82 | 66.67 | 62.69 |
| BirdsInCage | 63.83 | 66.64 | 62.69 | 63.83 | 66.64 | 62.69 |
| Average | 62.80 | 65.15 | 61.81 | 62.79 | 65.15 | 61.80 |
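Table A5. Error histograms in R’, G’, B’ after one and two cycles of color transformations R’G’B’ → YCBCR → R’G’B’ for selected test images for Scenario A (ITU Rec. BT.709, 8-bit sample representations).

| Image | Range | Component | First cycle: −2 | First cycle: −1 | First cycle: 0 | First cycle: 1 | First cycle: 2 | Second cycle: −2 | Second cycle: −1 | Second cycle: 0 | Second cycle: 1 | Second cycle: 2 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| I1 | Full range (“full-swing”) | R’ | 0 | 36,449 | 127,620 | 32,539 | 0 | 0 | 36,449 | 127,620 | 32,539 | 0 |
| I1 | Full range (“full-swing”) | G’ | 0 | 9617 | 168,749 | 18,242 | 0 | 0 | 9617 | 168,749 | 18,242 | 0 |
| I1 | Full range (“full-swing”) | B’ | 0 | 46,274 | 99,792 | 50,542 | 0 | 0 | 46,276 | 99,790 | 50,542 | 0 |
| I1 | “Television range” | R’ | 0 | 44,467 | 102,277 | 49,864 | 0 | 0 | 44,467 | 102,277 | 49,864 | 0 |
| I1 | “Television range” | G’ | 0 | 19,983 | 158,979 | 17,646 | 0 | 0 | 19,983 | 158,979 | 17,646 | 0 |
| I1 | “Television range” | B’ | 399 | 55,315 | 91,817 | 48,471 | 606 | 399 | 55,322 | 91,810 | 48,471 | 606 |
| I6 | Full range (“full-swing”) | R’ | 0 | 32,724 | 125,468 | 38,416 | 0 | 0 | 32,724 | 125,467 | 38,417 | 0 |
| I6 | Full range (“full-swing”) | G’ | 0 | 9759 | 172,699 | 14,150 | 0 | 0 | 9760 | 172,698 | 14,150 | 0 |
| I6 | Full range (“full-swing”) | B’ | 0 | 43,555 | 105,917 | 47,136 | 0 | 0 | 43,564 | 105,908 | 47,136 | 0 |
| I6 | “Television range” | R’ | 0 | 39,646 | 113,747 | 43,215 | 0 | 0 | 39,646 | 113,698 | 43,264 | 0 |
| I6 | “Television range” | G’ | 0 | 18,408 | 159,602 | 18,598 | 0 | 0 | 18,425 | 159,584 | 18,599 | 0 |
| I6 | “Television range” | B’ | 793 | 48,827 | 92,140 | 53,866 | 982 | 793 | 48,834 | 92,133 | 53,866 | 982 |
| I11 | Full range (“full-swing”) | R’ | 0 | 34,789 | 107,151 | 54,668 | 0 | 0 | 34,789 | 107,149 | 54,670 | 0 |
| I11 | Full range (“full-swing”) | G’ | 0 | 13,833 | 174,338 | 8437 | 0 | 0 | 13,837 | 174,334 | 8437 | 0 |
| I11 | Full range (“full-swing”) | B’ | 0 | 59,956 | 97,349 | 39,303 | 0 | 0 | 59,967 | 97,338 | 39,303 | 0 |
| I11 | “Television range” | R’ | 0 | 56,927 | 98,586 | 41,095 | 0 | 0 | 56,927 | 98,579 | 41,102 | 0 |
| I11 | “Television range” | G’ | 0 | 19,957 | 155,315 | 21,336 | 0 | 0 | 19,958 | 155,312 | 21,338 | 0 |
| I11 | “Television range” | B’ | 679 | 38,811 | 108,164 | 48,534 | 420 | 679 | 38,823 | 108,152 | 48,534 | 420 |
| I16 | Full range (“full-swing”) | R’ | 0 | 42,659 | 113,479 | 40,470 | 0 | 0 | 42,659 | 113,479 | 40,470 | 0 |
| I16 | Full range (“full-swing”) | G’ | 0 | 8948 | 169,512 | 18,148 | 0 | 0 | 8948 | 169,512 | 18,148 | 0 |
| I16 | Full range (“full-swing”) | B’ | 0 | 44,773 | 86,620 | 65,215 | 0 | 0 | 44,775 | 86,618 | 65,215 | 0 |
| I16 | “Television range” | R’ | 0 | 47,864 | 105,421 | 43,323 | 0 | 0 | 47,864 | 105,392 | 43,352 | 0 |
| I16 | “Television range” | G’ | 0 | 18,849 | 155,388 | 22,371 | 0 | 0 | 18,849 | 155,388 | 22,371 | 0 |
| I16 | “Television range” | B’ | 1675 | 54,357 | 89,681 | 50,160 | 735 | 1675 | 54,361 | 89,677 | 50,160 | 735 |
| I21 | Full range (“full-swing”) | R’ | 0 | 36,838 | 123,752 | 36,018 | 0 | 0 | 36,838 | 123,734 | 36,036 | 0 |
| I21 | Full range (“full-swing”) | G’ | 0 | 8920 | 176,705 | 10,983 | 0 | 0 | 8923 | 176,702 | 10,983 | 0 |
| I21 | Full range (“full-swing”) | B’ | 0 | 44,794 | 101,868 | 49,946 | 0 | 0 | 44,794 | 101,868 | 49,946 | 0 |
| I21 | “Television range” | R’ | 0 | 43,437 | 109,934 | 43,237 | 0 | 0 | 43,437 | 109,915 | 43,256 | 0 |
| I21 | “Television range” | G’ | 0 | 18,296 | 157,261 | 21,051 | 0 | 0 | 18,303 | 157,253 | 21,052 | 0 |
| I21 | “Television range” | B’ | 461 | 56,460 | 86,987 | 52,002 | 698 | 461 | 56,462 | 86,985 | 52,002 | 698 |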
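Table A6. Error histograms (values in thousands) in R’, G’, B’ after one and two cycles of color transformations R’G’B’ → YCBCR → R’G’B’ for selected test video clips for Scenario B (ITU Rec. BT.2020, 10-bit sample representations).

| Clip | Range | Component | First cycle: −2 | First cycle: −1 | First cycle: 0 | First cycle: 1 | First cycle: 2 | Second cycle: −2 | Second cycle: −1 | Second cycle: 0 | Second cycle: 1 | Second cycle: 2 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Traffic | Full range | R’ | 0 | 767,963 | 2,562,275 | 765,762 | 0 | 0 | 767,963 | 2,562,275 | 765,762 | 0 |
| Traffic | Full range | G’ | 0 | 260,193 | 3,578,089 | 257,718 | 0 | 0 | 260,193 | 3,578,089 | 257,718 | 0 |
| Traffic | Full range | B’ | 0 | 974,951 | 2,140,594 | 980,455 | 0 | 0 | 974,951 | 2,140,594 | 980,455 | 0 |
| Traffic | “Television range” | R’ | 0 | 880,200 | 2,335,123 | 880,677 | 0 | 0 | 880,200 | 2,335,123 | 880,677 | 0 |
| Traffic | “Television range” | G’ | 0 | 458,292 | 3,178,669 | 459,039 | 0 | 0 | 458,292 | 3,178,669 | 459,039 | 0 |
| Traffic | “Television range” | B’ | 19,992 | 1,065,436 | 1,921,227 | 1,069,169 | 20,176 | 19,992 | 1,065,436 | 1,921,227 | 1,069,169 | 20,176 |
| ParkScene | Full range | R’ | 0 | 379,495 | 1,308,960 | 385,145 | 0 | 0 | 379,495 | 1,308,960 | 385,145 | 0 |
| ParkScene | Full range | G’ | 0 | 153,169 | 1,772,287 | 148,144 | 0 | 0 | 153,169 | 1,772,287 | 148,144 | 0 |
| ParkScene | Full range | B’ | 0 | 487,934 | 1,098,535 | 487,131 | 0 | 0 | 487,934 | 1,098,535 | 487,131 | 0 |
| ParkScene | “Television range” | R’ | 0 | 451,467 | 1,169,463 | 452,670 | 0 | 0 | 451,467 | 1,169,463 | 452,670 | 0 |
| ParkScene | “Television range” | G’ | 0 | 232,644 | 1,604,643 | 236,313 | 0 | 0 | 232,644 | 1,604,643 | 236,313 | 0 |
| ParkScene | “Television range” | B’ | 12,074 | 542,801 | 964,331 | 544,876 | 9518 | 12,074 | 542,801 | 964,331 | 544,876 | 9518 |
| OldTownCross | Full range | R’ | 0 | 382,424 | 1,309,311 | 381,865 | 0 | 0 | 382,424 | 1,309,311 | 381,865 | 0 |
| OldTownCross | Full range | G’ | 0 | 152,658 | 1,768,029 | 152,913 | 0 | 0 | 152,658 | 1,768,029 | 152,913 | 0 |
| OldTownCross | Full range | B’ | 0 | 486,235 | 1,098,702 | 488,663 | 0 | 0 | 486,235 | 1,098,702 | 488,663 | 0 |
| OldTownCross | “Television range” | R’ | 0 | 452,050 | 1,169,667 | 451,883 | 0 | 0 | 452,050 | 1,169,667 | 451,883 | 0 |
| OldTownCross | “Television range” | G’ | 0 | 232,606 | 1,607,895 | 233,099 | 0 | 0 | 232,606 | 1,607,895 | 233,099 | 0 |
| OldTownCross | “Television range” | B’ | 10,076 | 543,274 | 965,774 | 544,103 | 10,373 | 10,076 | 543,274 | 965,774 | 544,103 | 10,373 |
| Kimono | Full range | R’ | 0 | 382,694 | 1,305,426 | 385,480 | 0 | 0 | 382,694 | 1,305,426 | 385,480 | 0 |
| Kimono | Full range | G’ | 0 | 148,821 | 1,777,280 | 147,499 | 0 | 0 | 148,821 | 1,777,280 | 147,499 | 0 |
| Kimono | Full range | B’ | 0 | 491,067 | 1,089,361 | 493,172 | 0 | 0 | 491,067 | 1,089,361 | 493,172 | 0 |
| Kimono | “Television range” | R’ | 0 | 450,170 | 1,172,094 | 451,336 | 0 | 0 | 450,170 | 1,172,094 | 451,336 | 0 |
| Kimono | “Television range” | G’ | 0 | 233,814 | 1,608,944 | 230,842 | 0 | 0 | 233,814 | 1,608,944 | 230,842 | 0 |
| Kimono | “Television range” | B’ | 10,064 | 541,538 | 970,784 | 539,751 | 11,463 | 10,064 | 541,538 | 970,784 | 539,751 | 11,463 |
| DucksAndLegs | Full range | R’ | 0 | 381,074 | 1,309,300 | 383,226 | 0 | 0 | 381,074 | 1,309,300 | 383,226 | 0 |
| DucksAndLegs | Full range | G’ | 0 | 150,666 | 1,772,794 | 150,140 | 0 | 0 | 150,666 | 1,772,794 | 150,140 | 0 |
| DucksAndLegs | Full range | B’ | 0 | 485,554 | 1,098,621 | 489,425 | 0 | 0 | 485,554 | 1,098,621 | 489,425 | 0 |
| DucksAndLegs | “Television range” | R’ | 0 | 450,389 | 1,172,249 | 450,962 | 0 | 0 | 450,389 | 1,172,249 | 450,962 | 0 |
| DucksAndLegs | “Television range” | G’ | 0 | 234,362 | 1,606,636 | 232,602 | 0 | 0 | 234,362 | 1,606,636 | 232,602 | 0 |
| DucksAndLegs | “Television range” | B’ | 10,384 | 543,620 | 967,460 | 542,071 | 10,065 | 10,384 | 543,620 | 967,460 | 542,071 | 10,065 |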
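The error histograms and the PSNR values in this appendix are two views of the same data: the mean squared error is the histogram-weighted average of the squared error values. The following Python sketch recomputes PSNR from the first-cycle, full-range histograms of image I1 in Table A5. The function name `psnr_from_histogram` and the dictionary layout are hypothetical, and the usual definition of PSNR with a peak value of 2^bits − 1 is assumed.

```python
import math

# First-cycle, full-range error histograms of image I1 (Table A5).
# Keys are error values, values are numbers of samples.
HISTOGRAMS_I1_FULL = {
    "R": {-1: 36449, 0: 127620, 1: 32539},
    "G": {-1: 9617, 0: 168749, 1: 18242},
    "B": {-1: 46274, 0: 99792, 1: 50542},
}


def psnr_from_histogram(histogram, bits=8):
    """PSNR in dB computed from an error histogram {error: count}."""
    total = sum(histogram.values())
    mse = sum(count * error * error for error, count in histogram.items()) / total
    peak = (1 << bits) - 1
    return 10.0 * math.log10(peak * peak / mse)


PSNR_I1 = {name: psnr_from_histogram(h) for name, h in HISTOGRAMS_I1_FULL.items()}

for name, value in PSNR_I1.items():
    print(f"PSNR {name}: {value:.2f} dB")
```

The three results agree, to two decimal places, with the first-cycle values listed for I01 in Table A1 (52.68, 56.62 and 51.21 dB).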

Appendix B

Table A7. Scenario A: PSNR [dB] after one and two cycles of color transformations R’G’B’ → YCBCR → R’G’B’ according to ITU Rec. BT.709 for the full range of luma and chroma (8-bit sample representations). In the first cycle the values of luma and chroma are multiplied by the factor of 0.9 whereas they are multiplied by the factor of 1/0.9 in the second cycle.

| Image | First cycle PSNR R | First cycle PSNR G | First cycle PSNR B | Second cycle PSNR R | Second cycle PSNR G | Second cycle PSNR B |
|---|---|---|---|---|---|---|
| I01 | 50.52 | 53.61 | 39.56 | 50.52 | 53.61 | 39.54 |
| I02 | 48.11 | 52.32 | 32.36 | 48.11 | 52.31 | 32.32 |
| I03 | 48.76 | 52.72 | 30.27 | 48.75 | 52.68 | 30.23 |
| I04 | 45.18 | 53.65 | 38.28 | 45.18 | 53.65 | 38.27 |
| I05 | 45.31 | 52.94 | 32.17 | 45.31 | 52.93 | 32.14 |
| I06 | 49.50 | 54.20 | 41.24 | 49.50 | 54.19 | 41.21 |
| I07 | 49.74 | 52.77 | 34.95 | 49.74 | 52.75 | 34.92 |
| I08 | 48.85 | 54.15 | 48.24 | 48.85 | 54.03 | 48.21 |
| I09 | 45.47 | 53.05 | 42.30 | 45.47 | 53.05 | 42.29 |
| I10 | 45.67 | 53.94 | 44.62 | 45.67 | 53.94 | 44.61 |
| I11 | 48.63 | 52.80 | 30.75 | 48.63 | 52.78 | 30.64 |
| I12 | 49.51 | 53.95 | 47.08 | 49.50 | 53.95 | 47.06 |
| I13 | 49.25 | 53.62 | 37.63 | 49.24 | 53.62 | 37.60 |
| I14 | 49.48 | 53.41 | 35.08 | 49.48 | 53.39 | 35.05 |
| I15 | 43.48 | 53.77 | 35.04 | 43.48 | 53.76 | 35.03 |
| I16 | 50.18 | 53.40 | 43.72 | 50.18 | 53.40 | 43.70 |
| I17 | 41.11 | 50.93 | 30.27 | 41.11 | 50.93 | 30.25 |
| I18 | 43.87 | 52.87 | 32.59 | 43.87 | 52.84 | 32.54 |
| I19 | 48.81 | 53.86 | 38.57 | 48.81 | 53.82 | 38.54 |
| I20 | 45.91 | 54.57 | 35.97 | 45.88 | 54.50 | 35.89 |
| I21 | 48.73 | 53.45 | 41.46 | 48.73 | 53.44 | 41.45 |
| I22 | 49.41 | 53.91 | 44.50 | 49.40 | 53.85 | 44.46 |
| I23 | 49.53 | 53.67 | 34.00 | 49.52 | 53.64 | 34.00 |
| I24 | 49.45 | 53.80 | 48.09 | 49.45 | 53.79 | 48.07 |
| I25 | 28.68 | 54.46 | 27.13 | 28.68 | 54.45 | 27.13 |
| Average | 46.93 | 53.43 | 37.84 | 46.92 | 53.41 | 37.81 |
Table A8. Scenario A: PSNR [dB] after one and two cycles of color transformations R’G’B’ → YCBCR → R’G’B’ according to ITU Rec. BT.709 for the “television range” of luma and chroma (8-bit sample representations). In the first cycle the values of luma and chroma are multiplied by the factor of 0.9 whereas they are multiplied by the factor of 1/0.9 in the second cycle.

| Image | First cycle PSNR R | First cycle PSNR G | First cycle PSNR B | Second cycle PSNR R | Second cycle PSNR G | Second cycle PSNR B |
|---|---|---|---|---|---|---|
| I01 | 48.20 | 51.54 | 35.66 | 48.20 | 51.53 | 35.63 |
| I02 | 46.18 | 49.67 | 28.32 | 46.18 | 49.66 | 28.27 |
| I03 | 48.02 | 51.78 | 27.55 | 48.01 | 51.75 | 27.52 |
| I04 | 43.54 | 51.62 | 34.45 | 43.38 | 51.59 | 34.41 |
| I05 | 41.61 | 51.35 | 29.26 | 41.59 | 51.34 | 29.24 |
| I06 | 49.08 | 52.65 | 38.37 | 49.08 | 52.63 | 38.31 |
| I07 | 48.67 | 51.85 | 31.80 | 48.67 | 51.85 | 31.77 |
| I08 | 46.36 | 52.41 | 45.16 | 46.35 | 52.41 | 45.13 |
| I09 | 44.01 | 52.08 | 40.18 | 43.83 | 52.08 | 40.26 |
| I10 | 43.03 | 51.92 | 43.07 | 42.89 | 51.92 | 43.23 |
| I11 | 45.86 | 51.69 | 28.02 | 45.86 | 51.68 | 28.01 |
| I12 | 48.35 | 52.56 | 44.00 | 48.34 | 52.56 | 43.96 |
| I13 | 48.21 | 52.00 | 34.30 | 48.20 | 51.98 | 34.22 |
| I14 | 47.82 | 51.73 | 32.08 | 47.82 | 51.70 | 32.02 |
| I15 | 39.20 | 51.90 | 31.81 | 39.19 | 51.89 | 31.79 |
| I16 | 49.29 | 52.35 | 40.75 | 49.29 | 52.35 | 40.68 |
| I17 | 35.44 | 51.52 | 27.39 | 35.44 | 51.52 | 27.39 |
| I18 | 39.70 | 51.25 | 29.54 | 39.70 | 51.24 | 29.52 |
| I19 | 47.40 | 52.05 | 35.50 | 47.40 | 52.04 | 35.45 |
| I20 | 42.02 | 53.70 | 33.01 | 42.02 | 53.52 | 32.95 |
| I21 | 46.98 | 51.83 | 38.10 | 46.98 | 51.83 | 38.06 |
| I22 | 48.92 | 52.41 | 41.72 | 48.91 | 52.37 | 41.66 |
| I23 | 48.76 | 52.32 | 32.05 | 48.75 | 52.29 | 32.03 |
| I24 | 48.73 | 52.53 | 45.39 | 48.73 | 52.53 | 45.34 |
| I25 | 27.18 | 53.58 | 25.49 | 26.89 | 53.57 | 25.73 |
| Average | 44.90 | 52.01 | 34.92 | 44.87 | 51.99 | 34.90 |

Appendix C

Table A9. Scenario A (ITU Rec. BT.709): PSNR [dB] for R’, G’, B’ after one cycle of color transformation from R’G’B’ to YCBCR, AVC compression, then decompression and reverse transformation to R’G’B’ with respect to the corresponding values. PSNR [dB] for Y, CB, CR after AVC decompression with respect to the corresponding values before compression. The values obtained for the full range of luma and chroma values (8-bit sample representations).

| Image | PSNR R | PSNR G | PSNR B | PSNR Y | PSNR CB | PSNR CR |
|---|---|---|---|---|---|---|
| I01 | 29.91 | 30.41 | 29.97 | 30.53 | 43.40 | 41.99 |
| I02 | 31.62 | 34.11 | 32.97 | 34.22 | 43.11 | 39.65 |
| I03 | 34.03 | 35.32 | 33.70 | 35.57 | 42.13 | 42.67 |
| I04 | 33.37 | 35.03 | 34.10 | 35.25 | 45.40 | 41.71 |
| I05 | 29.12 | 29.99 | 28.73 | 30.19 | 38.46 | 39.01 |
| I06 | 30.47 | 31.16 | 30.22 | 31.27 | 41.25 | 42.01 |
| I07 | 32.61 | 33.92 | 32.35 | 34.23 | 41.31 | 41.06 |
| I08 | 29.28 | 29.98 | 28.94 | 30.12 | 40.10 | 40.03 |
| I09 | 32.98 | 33.68 | 32.37 | 33.91 | 41.87 | 43.08 |
| I10 | 33.01 | 33.72 | 32.30 | 33.99 | 41.80 | 42.63 |
| I11 | 30.69 | 31.71 | 30.86 | 31.85 | 41.88 | 40.77 |
| I12 | 32.96 | 33.93 | 32.85 | 34.13 | 43.59 | 42.65 |
| I13 | 28.89 | 29.27 | 28.10 | 29.39 | 38.88 | 41.65 |
| I14 | 29.46 | 30.81 | 29.32 | 30.98 | 38.42 | 38.61 |
| I15 | 32.41 | 34.13 | 33.15 | 34.33 | 44.02 | 40.95 |
| I16 | 33.58 | 34.31 | 33.33 | 34.42 | 43.94 | 45.12 |
| I17 | 33.13 | 33.65 | 32.42 | 33.85 | 42.62 | 43.46 |
| I18 | 30.23 | 31.27 | 29.72 | 31.47 | 39.16 | 39.56 |
| I19 | 31.00 | 31.84 | 30.69 | 31.98 | 40.94 | 41.39 |
| I20 | 33.66 | 34.32 | 32.40 | 34.53 | 41.44 | 43.37 |
| I21 | 31.01 | 31.65 | 30.32 | 31.78 | 40.62 | 42.20 |
| I22 | 31.02 | 31.96 | 30.61 | 32.36 | 40.30 | 38.88 |
| I23 | 33.93 | 35.00 | 33.24 | 35.35 | 42.19 | 41.99 |
| I24 | 31.36 | 32.19 | 30.63 | 32.48 | 40.36 | 39.88 |
| I25 | 28.43 | 32.23 | 28.24 | 32.32 | 35.50 | 34.29 |
| Average | 31.53 | 32.62 | 31.26 | 32.82 | 41.31 | 41.14 |
Table A10. Scenario A (ITU Rec. BT.709): PSNR [dB] for R’, G’, B’ after one cycle of color transformation from R’G’B’ to YCBCR, AVC compression, then decompression and reverse transformation to R’G’B’ with respect to the corresponding values. PSNR [dB] for Y, CB, CR after AVC decompression with respect to the corresponding values before compression. The values obtained for the “television range” of luma and chroma values (8-bit sample representations).

| Image | PSNR R | PSNR G | PSNR B | PSNR Y | PSNR CB | PSNR CR |
|---|---|---|---|---|---|---|
| I01 | 29.39 | 29.80 | 29.35 | 31.24 | 43.95 | 42.96 |
| I02 | 31.19 | 33.63 | 32.52 | 35.02 | 43.95 | 40.49 |
| I03 | 33.61 | 34.76 | 33.21 | 36.33 | 42.74 | 43.44 |
| I04 | 32.93 | 34.54 | 33.57 | 36.08 | 46.31 | 42.33 |
| I05 | 28.38 | 29.20 | 27.99 | 30.68 | 39.02 | 39.49 |
| I06 | 29.79 | 30.47 | 29.56 | 31.88 | 41.76 | 42.36 |
| I07 | 32.03 | 33.23 | 31.77 | 34.87 | 42.12 | 41.68 |
| I08 | 28.66 | 29.31 | 28.33 | 30.71 | 40.75 | 40.74 |
| I09 | 32.34 | 33.02 | 31.78 | 34.55 | 42.42 | 43.65 |
| I10 | 32.43 | 33.16 | 31.83 | 34.70 | 42.61 | 43.25 |
| I11 | 30.13 | 31.10 | 30.30 | 32.55 | 42.72 | 41.41 |
| I12 | 32.51 | 33.41 | 32.37 | 34.91 | 44.43 | 43.39 |
| I13 | 28.13 | 28.46 | 27.45 | 29.85 | 39.65 | 42.27 |
| I14 | 28.94 | 30.25 | 28.79 | 31.72 | 39.19 | 39.40 |
| I15 | 31.82 | 33.51 | 32.62 | 34.97 | 44.47 | 41.60 |
| I16 | 33.02 | 33.71 | 32.77 | 35.13 | 44.54 | 45.89 |
| I17 | 32.63 | 33.13 | 31.93 | 34.62 | 43.37 | 44.35 |
| I18 | 29.62 | 30.58 | 29.20 | 32.04 | 39.91 | 40.39 |
| I19 | 30.38 | 31.17 | 30.05 | 32.61 | 41.59 | 42.27 |
| I20 | 33.07 | 33.76 | 31.92 | 35.22 | 42.12 | 44.10 |
| I21 | 30.23 | 30.85 | 29.67 | 32.27 | 41.28 | 42.80 |
| I22 | 30.57 | 31.38 | 30.08 | 33.08 | 40.95 | 39.73 |
| I23 | 33.28 | 34.45 | 32.70 | 36.07 | 42.88 | 42.55 |
| I24 | 30.76 | 31.54 | 30.09 | 33.08 | 41.11 | 40.51 |
| I25 | 27.86 | 31.03 | 27.41 | 32.28 | 34.83 | 34.38 |
| Average | 30.95 | 31.98 | 30.69 | 33.46 | 41.95 | 41.82 |
Table A11. Scenario B (ITU Rec. BT.2020): PSNR [dB] for R’, G’, B’ after one cycle of color transformation from R’G’B’ to YCBCR, HEVC compression, then decompression and reverse transformation to R’G’B’ with respect to the corresponding values. PSNR [dB] for Y, CB, CR after HEVC decompression with respect to the corresponding values before compression. The values obtained for the full range of luma and chroma values (10-bit sample representations).

| Video Clip | PSNR R | PSNR G | PSNR B | PSNR Y | PSNR CB | PSNR CR |
|---|---|---|---|---|---|---|
| Traffic | 37.10 | 38.60 | 31.78 | 39.79 | 38.28 | 42.43 |
| ParkScene | 33.36 | 35.97 | 30.42 | 38.10 | 35.97 | 37.73 |
| OldTownCross | 31.57 | 34.42 | 29.03 | 35.88 | 34.37 | 36.91 |
| Kimono1 | 35.88 | 38.70 | 31.48 | 40.62 | 37.08 | 40.52 |
| EBURainFruits | 38.98 | 39.69 | 36.03 | 40.51 | 44.05 | 44.98 |
| EBULupoCandlelight | 40.35 | 41.07 | 37.21 | 41.72 | 44.30 | 47.55 |
| DucksAndLegs | 30.83 | 32.99 | 26.09 | 35.86 | 31.53 | 34.52 |
| BirdsInCage | 40.10 | 39.69 | 32.12 | 42.50 | 37.68 | 43.06 |
| Average | 36.02 | 37.64 | 31.77 | 39.37 | 37.91 | 40.96 |
Table A12. Scenario B (ITU Rec. BT.2020): PSNR [dB] for R’, G’, B’ after one cycle of color transformation from R’G’B’ to YCBCR, HEVC compression, then decompression and reverse transformation to R’G’B’ with respect to the corresponding values. PSNR [dB] for Y, CB, CR after HEVC decompression with respect to the corresponding values before compression. The values obtained for the “television range” of luma and chroma values (10-bit sample representations).

| Video Clip | PSNR R | PSNR G | PSNR B | PSNR Y | PSNR CB | PSNR CR |
|---|---|---|---|---|---|---|
| Traffic | 36.71 | 38.16 | 31.51 | 40.63 | 39.23 | 43.30 |
| ParkScene | 33.00 | 35.54 | 30.22 | 38.92 | 36.95 | 38.62 |
| OldTownCross | 31.44 | 34.22 | 28.99 | 37.00 | 35.50 | 37.97 |
| Kimono1 | 35.58 | 38.36 | 31.33 | 41.52 | 38.11 | 41.47 |
| EBURainFruits | 38.47 | 39.15 | 35.64 | 41.29 | 44.87 | 45.76 |
| EBULupoCandlelight | 39.93 | 40.70 | 36.95 | 42.67 | 45.21 | 48.38 |
| DucksAndLegs | 30.54 | 32.69 | 25.96 | 36.81 | 32.57 | 35.44 |
| BirdsInCage | 39.84 | 39.47 | 32.05 | 43.56 | 38.78 | 44.05 |
| Average | 35.69 | 37.29 | 31.58 | 40.30 | 38.90 | 41.87 |

References

  1. Wyszecki, G.; Stiles, W.S. Color Science: Concepts and Methods, Quantitative Data and Formulae, 2nd ed.; John Wiley & Sons: New York, NY, USA, 1982.
  2. Sangwine, S.J.; Horne, R.E.N. (Eds.) The Colour Image Processing Handbook; Chapman & Hall: London, UK, 1998.
  3. Morovič, J. Color Gamut Mapping; Wiley: Chichester, UK, 2008.
  4. Dubois, E. The Structure and Properties of Color Spaces and the Representation of Color Images; Springer: Cham, Switzerland, 2010.
  5. Bodrogi, P.; Khanh, T.Q. Illumination, Color and Imaging; Wiley-VCH: Weinheim, Germany, 2012.
  6. Provenzi, E. (Ed.) Color Image Processing; MDPI: Basel, Switzerland, 2018.
  7. Poynton, C. Digital Video and HD, 2nd ed.; Morgan Kaufmann: Waltham, MA, USA, 2012.
  8. Jiménez, I.; Valdez-Rodríguez, J.E.; Moreno-Armendáriz, M.A. Color Space Comparison of Isolated Cervix Cells for Morphology Classification. AI 2025, 6, 261.
  9. Chen, Z.; Liu, P.; Du, Y.; Luo, Y.; Zhang, W. Correlation Tracking via Self-Adaptive Fusion of Multiple Features. Information 2018, 9, 241.
  10. Zhao, Y.; Wu, M.; Zhang, L.; Wang, J.; Wei, D. An Effective Feature Segmentation Algorithm for a Hyper-Spectral Facial Image. Information 2018, 9, 261.
  11. Apicella, A.; Corazza, A.; Isgrò, F.; Vettigli, G. Integration of Context Information through Probabilistic Ontological Knowledge into Image Classification. Information 2018, 9, 252.
  12. ISO/IEC JTC 1/SC 29/WG 4; AhG on Video Coding for Machines. MPEG Video. Doc. m73899. ISO: Geneva, Switzerland, 2025.
  13. ISO/IEC JTC 1/SC 29/WG 4; Common Test Conditions for Video Coding for Machines. MPEG Video. Doc. N730. ISO: Geneva, Switzerland, 2025.
  14. ISO/IEC JTC 1/SC 29/WG 4; AhG on Feature Coding for Machines. MPEG Video. Doc. 73872. ISO: Geneva, Switzerland, 2025.
  15. ITU-T SG21 WP3/21|ISO/IEC JTC 1/SC 29; JVET AHG Report: Test Model Software Development (AHG3), 40th Meeting, Geneva, Switzerland, 3–12 October 2025. Doc. JVET-AN0003-v1. ITU: Geneva, Switzerland, 2025.
  16. ITU-R Rec. BT.601-7; Studio Encoding Parameters of Digital Television for Standard 4:3 and Wide-Screen 16:9 Aspect Ratios. ITU: Geneva, Switzerland, 2011.
  17. ITU-R Rec. BT.709-6; Parameter Values for the HDTV Standards for Production and International Programme Exchange. ITU: Geneva, Switzerland, 2015.
  18. ITU-R Rec. BT.2020-2; Parameter Values for Ultra-High Definition Television Systems for Production and International Programme Exchange. ITU: Geneva, Switzerland, 2015.
  19. Ma, C.; Liu, D.; Peng, X.; Li, L.; Wu, F. Convolutional Neural Network-Based Arithmetic Coding for HEVC Intra-Predicted Residues. IEEE Trans. Circuits Syst. Video Technol. 2020, 30, 1901–1916.
  20. Klóska, D.; Dziembowski, A.; Grzelka, A.; Mieloch, D. On the Selection of Transmitted Views for Decoder-Side Depth Estimation. Appl. Sci. 2026, 16, 72.
  21. Ai, D.; Wang, J.; He, T.; Yuan, H.; Liu, Y.; Ling, N. Temporal and Spatial Perception: A Novel Perceptual Rate-Distortion Optimization Method for H.266/VVC Encoding. IEEE Trans. Circuits Syst. Video Technol. 2025, 35, 8299–8313.
  22. Lorkiewicz, M.; Stankiewicz, O.; Domański, M.; Hang, H.-M.; Peng, W.-H. Complexity-Efficiency Control With ANN-Based CTU Partitioning for Video Encoding. IEEE Access 2024, 12, 102536–102551.
  23. Lorkiewicz, M.; Różek, S.; Stankiewicz, O.; Grajek, T.; Maćkowiak, S.; Domański, M. Video Coding for Machines With Neural-Network-Based Chroma Synthesis. IEEE Access 2025, 13, 112777–112784.
  24. Winkler, S. Digital Video Quality; Wiley: Chichester, UK, 2005.
  25. Wang, Z.; Bovik, A.C.; Sheikh, H.R.; Simoncelli, E.P. Image Quality Assessment: From Error Visibility to Structural Similarity. IEEE Trans. Image Process. 2004, 13, 600–612.
  26. ITU-T Rec. J.144; Objective Perceptual Video Quality Measurement Techniques for Digital Cable Television in the Presence of a Full Reference. ITU: Geneva, Switzerland, 2004.
  27. Netflix/VMAF. Available online: https://github.com/Netflix/vmaf (accessed on 5 December 2025).
  28. Mitra, S.K. Digital Signal Processing: A Computer-Based Approach, 4th ed.; McGraw-Hill: New York, NY, USA, 2011.
  29. Domański, M.; Klimaszewski, K.; Konieczny, J.; Kurc, M.; Ratajczak, R.; Siast, J.; Stankiewicz, O.; Stankowski, J.; Wegner, K. Image Coding Method. Publication US 2013/0129235A1. U.S. Patent US8761527, 24 June 2014.
  30. Ameur, Z.; Hamidouche, W.; François, E.; Radosavljević, M.; Menard, D.; Demarty, C.-H. Deep-Based Film Grain Removal and Synthesis. IEEE Trans. Image Process. 2023, 32, 5046–5059.
  31. van Assche, S.; Philips, W.; Lemahieu, I. Lossless Compression of Pre-Press Images Using a Novel Color Decorrelation Technique. Pattern Recognit. 1999, 32, 435–441.
  32. Andriani, S.; Calvagno, G. Lossless Compression of Colour Video Sequence using Optimal Prediction Theory-Octopus. In 2007 Data Compression Conference (DCC’07), Snowbird, UT, USA, 27–29 March 2007; IEEE: Piscataway, NJ, USA, 2007; p. 375.
  33. Malvar, H.S.; Sullivan, G.J.; Srinivasan, S. Lifting-Based Reversible Color Transformations for Image Compression. Proc. SPIE 2008, 7073, 707307.
  34. Strutz, T. Multiplierless Reversible Color Transforms and Their Automatic Selection for Image Data Compression. IEEE Trans. Circuits Syst. Video Technol. 2013, 23, 1249–1259.
  35. Domański, M.; Rakowski, K. Color Transformations for Lossless Image Compression. In X European Signal Processing Conference, EUSIPCO 2000 Tampere, Finland, 4–8 September 2000; IEEE: Piscataway, NJ, USA, 2000; Volume III, pp. 1361–1364.
  36. Domański, M.; Rakowski, K. A Simple Technique for Near-Lossless Coding of Color Images. In IEEE International Symposium on Circuits and Systems, ISCAS, 2000, Geneva, Switzerland, 28–31 May 2000; IEEE: Piscataway, NJ, USA, 2000; Volume III, pp. 299–302.
  37. Domański, M.; Rakowski, K. Lossless and Near-Lossless Image Compression with Color Transformations. In IEEE International Conference of Image Processing, ICIP 2001, Thessaloniki, Greece, 7–10 October 2001; IEEE: Piscataway, NJ, USA, 2000; Volume III, pp. 454–457.
  38. Lee, Y.-L.; Tsai, W.-H. A New Secure Image Transmission Technique via Secret-Fragment-Visible Mosaic Images by Nearly Reversible Color Transformations. IEEE Trans. Circuits Syst. Video Technol. 2014, 24, 695–703.
  39. Azimi, M.; Pourazad, M.T. A Novel Chroma Processing Scheme for Improved Color Accuracy of HDR Video Content. IEEE Trans. Broadcast. 2020, 66, 718–728.
  40. Rakowski, K. Accumulation of Errors Caused by Image Compression and Color Transformations. Ph.D. Thesis, Poznań University of Technology, Poznań, Poland, 2004.
  41. ITU-R Report BT.2246-9; Present State of Ultra-High Definition Television. ITU: Geneva, Switzerland, 2025.
  42. Tekalp, A.M. Digital Video Processing, 2nd ed.; Prentice Hall: New York, NY, USA, 2015.
  43. Manjunath, B.; Salembier, P.; Sikora, T. (Eds.) Introduction to MPEG-7, Multimedia Content Description Interface; John Wiley & Sons: Chichester, UK, 2002.
  44. Kodak Lossless True Color Image Suite. Available online: https://r0k.us/graphics/kodak/ (accessed on 15 September 2025).
  45. Tampere Image Database 2013 TID2013, Version 1. Available online: https://www.ponomarenko.info/tid2013.htm (accessed on 15 September 2025).
  46. Ponomarenko, N.; Jin, L.; Ieremeiev, O.; Lukin, V.; Egiazarian, K.; Astola, J.; Vozel, B.; Chehdi, K.; Carli, M.; Battisti, F.; et al. Image Database TID2013: Peculiarities, Results and Perspectives. Signal Process. Image Commun. 2015, 30, 57–77.
  47. ISO/IEC IS 14496-10|ITU-T Rec. H.264; Generic Coding of Audio-Visual Objects, Part 10: Advanced Video Coding. ITU: Geneva, Switzerland, 2024.
  48. Wiegand, T.; Sullivan, G.J.; Bjøntegaard, G.; Luthra, A. Overview of the H.264/AVC video coding standard. IEEE Trans. Circuits Syst. Video Technol. 2003, 13, 560–576.
  49. FFMPEG. Available online: https://ffmpeg.org (accessed on 15 September 2025).
  50. Domański, M.; Grzelka, A.; Stankiewicz, O. Influence of color transformation on codec performance. In 40th Meeting, Geneva, Switzerland, 3–12 October 2025; Doc. JVET-AN0003-v1; Joint Video Experts Team (JVET) of ITU-T SG21 WP3/21 and ISO/IEC JTC 1/SC 29; ISO: Geneva, Switzerland, 2025.
  51. ISO/IEC IS 23008-2|ITU-T Rec. H.265; High Efficiency Coding and Media Delivery in Heterogeneous Environments—Part 2: High Efficiency Video Coding (HEVC). ISO: Geneva, Switzerland, 2013.
  52. Sullivan, G.J.; Ohm, J.-R.; Han, W.-J.; Wiegand, T. Overview of the High Efficiency Video Coding (HEVC) Standard. IEEE Trans. Circuits Syst. Video Technol. 2012, 22, 1649–1668. [Google Scholar] [CrossRef]
  53. Joint Collaborative Team on Video Coding (JCT-VC) of ITU-T SG 16 WP 3 and ISO/IEC JTC 1/SC 29/WG 11. Common Test Conditions and Software Reference Configurations for HEVC Range Extensions, 12th Meeting, Geneva, Switzerland, 14–23 January 2013; Doc. JCTVC-L1006; ISO: Geneva, Switzerland, 2013. [Google Scholar]
  54. Available online: ftp://hevc@ftp.tnt.uni-hannover.de/testsequences/FrExt-candidate-sequences/ (accessed on 21 January 2016).
  55. ISO/IEC IS 23090-3|ITU-T Rec. H.266; Coded Representation of Immersive Media—Part 3: Versatile Video Coding, 2nd ed.; ISO: Geneva, Switzerland, 2022.
  56. ISO/IEC IS 10918-1|ITU-T Rec. T.81; Information Technology—Digital Compression and Coding of Continuous-Tone Still Images: Requirements and Guideline. ISO: Geneva, Switzerland, 1994.
  57. ISO/IEC IS 15444-1|ITU-T Rec. T.800; JPEG 2000 Image Coding System, Part 1: Core Coding System. 2nd ed. ISO: Geneva, Switzerland, 2004.
Figure 1. Two cycles of the color transformations R’G’B’ → YCBCR → R’G’B’.
Figure 2. Two cycles of the color transformations RGB → YCBCR → RGB where the YCBCR samples are multiplied by a constant factor of 0.9 in the first cycle and by 1/0.9 in the second cycle.
Figure 3. A cycle of the color transformations RGB → YCBCR → RGB where the YCBCR samples are encoded and then decoded by an AVC or HEVC codec. The rounding at the output of the video decoder is indicated in the figure.
Table 1. Theoretical properties of the BT.709: R’G’B’ → YCBCR → R’G’B’ color transformations.
| Feature | “Full-Swing”: 255T, 255S | “Television-Range”: 219T, 219S |
|---|---|---|
| Reversibility (6) | No | No |
| Error bounds LR, LG, LB (7) | LR = 1, LG = 1, LB = 1 | LR = 1, LG = 1, LB = 2 |
| No error accumulation (9) | Borderline | Yes |
Table 2. Test video R’G’B’ sequences used in experiments for Scenario B.
| Name of the Test Video Sequence | Frame Rate [Hz] | Frame Size |
|---|---|---|
| Traffic | 30 | 2560 × 1600 |
| ParkScene | 24 | 1920 × 1080 |
| OldTownCross | 50 | 1920 × 1080 |
| Kimono1 | 24 | 1920 × 1080 |
| EBURainFruits | 50 | 1920 × 1080 |
| EBULupoCandlelight | 50 | 1920 × 1080 |
| DucksAndLegs | 60 | 1920 × 1080 |
| BirdsInCage | 60 | 1920 × 1080 |
Table 3. The average PSNR [dB] after one or two cycles of R’G’B’ → YCBCR → R’G’B’ transformation. No compression is involved.
| Scenario | Range | First Cycle R’ | First Cycle G’ | First Cycle B’ | Second Cycle R’ | Second Cycle G’ | Second Cycle B’ |
|---|---|---|---|---|---|---|---|
| Scenario A (8-bit samples) | Full range (“Full-swing”) | 52.39 | 57.30 | 51.56 | 52.39 | 57.30 | 51.56 |
| Scenario A (8-bit samples) | “Television range” | 51.63 | 55.31 | 50.87 | 51.63 | 55.31 | 50.85 |
| Scenario B (10-bit samples) | Full range (“Full-swing”) | 64.48 | 68.77 | 63.46 | 64.48 | 68.77 | 63.46 |
| Scenario B (10-bit samples) | “Television range” | 62.80 | 65.15 | 61.81 | 62.79 | 65.15 | 61.80 |
Table 4. R’, G’, B’ error histograms (in thousands) for all images or sequences after one or two cycles of R’G’B’ → YCBCR → R’G’B’ transformation.
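| Scenario | Range | Component | First Cycle: −2 | −1 | 0 | 1 | 2 | Second Cycle: −2 | −1 | 0 | 1 | 2 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Scenario A | Full range (“Full-swing”) | R’ | 0 | 918 | 3059 | 938 | 0 | 0 | 918 | 3058 | 939 | 0 |
| Scenario A | Full range (“Full-swing”) | G’ | 0 | 274 | 4315 | 326 | 0 | 0 | 274 | 4315 | 326 | 0 |
| Scenario A | Full range (“Full-swing”) | B’ | 0 | 1054 | 2673 | 1189 | 0 | 0 | 1054 | 2672 | 1189 | 0 |
| Scenario A | “Television range” | R’ | 0 | 1072 | 2715 | 1128 | 0 | 0 | 1072 | 2715 | 1129 | 3 |
| Scenario A | “Television range” | G’ | 0 | 474 | 3972 | 470 | 0 | 0 | 474 | 3971 | 470 | 0 |
| Scenario A | “Television range” | B’ | 18 | 1250 | 2392 | 1240 | 15 | 18 | 1251 | 2382 | 1249 | 15 |
| Scenario B | Full range (“Full-swing”) | R’ | 0 | 3469 | 11,662 | 3480 | 0 | 0 | 3469 | 11,662 | 3480 | 0 |
| Scenario B | Full range (“Full-swing”) | G’ | 0 | 1286 | 16,051 | 1274 | 0 | 0 | 1286 | 16,051 | 1274 | 0 |
| Scenario B | Full range (“Full-swing”) | B’ | 0 | 4385 | 9817 | 4410 | 0 | 0 | 4385 | 9817 | 4410 | 0 |
| Scenario B | “Television range” | R’ | 0 | 4034 | 10,537 | 4040 | 0 | 0 | 4034 | 10,537 | 4040 | 0 |
| Scenario B | “Television range” | G’ | 0 | 2092 | 14,423 | 2097 | 0 | 0 | 2092 | 14,423 | 2097 | 0 |
| Scenario B | “Television range” | B’ | 94 | 4864 | 8690 | 4871 | 92 | 94 | 4864 | 8690 | 4871 | 92 |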
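A minimal sketch of how per-component PSNR values and error histograms of the kind reported in Tables 3 and 4 might be computed for one and two transformation cycles. It is not the authors' software. It assumes a common full-range 8-bit BT.709 formulation (chroma offset 128, rounding half up, clipping to 0..255) and uses uniformly random pixels instead of the test images, so the printed numbers will not reproduce the tables. All function names are hypothetical.

```python
import math
import random
from collections import Counter

# BT.709 luma coefficients; the green weight follows from the other two.
KR, KB = 0.2126, 0.0722
KG = 1.0 - KR - KB


def clip_round(x, lo=0, hi=255):
    """Round half up to an integer and clip to the 8-bit sample range."""
    return max(lo, min(hi, math.floor(x + 0.5)))


def rgb_to_ycbcr(r, g, b):
    """Assumed full-range 8-bit R'G'B' -> YCbCr with integer outputs."""
    y = KR * r + KG * g + KB * b
    cb = (b - y) / (2.0 * (1.0 - KB)) + 128.0
    cr = (r - y) / (2.0 * (1.0 - KR)) + 128.0
    return clip_round(y), clip_round(cb), clip_round(cr)


def ycbcr_to_rgb(y, cb, cr):
    """Inverse of rgb_to_ycbcr, again rounded to integer samples."""
    r = y + 2.0 * (1.0 - KR) * (cr - 128.0)
    b = y + 2.0 * (1.0 - KB) * (cb - 128.0)
    g = (y - KR * r - KB * b) / KG
    return clip_round(r), clip_round(g), clip_round(b)


def cycle(pixel):
    """One R'G'B' -> YCbCr -> R'G'B' cycle for a single pixel."""
    return ycbcr_to_rgb(*rgb_to_ycbcr(*pixel))


def psnr(errors, peak=255):
    """PSNR in dB from a list of sample errors; infinite if all are zero."""
    mse = sum(e * e for e in errors) / len(errors)
    return float("inf") if mse == 0 else 10.0 * math.log10(peak * peak / mse)


# Synthetic input (an assumption): random pixels, not the paper's test data.
random.seed(0)
pixels = [tuple(random.randrange(256) for _ in range(3)) for _ in range(20000)]
once = [cycle(p) for p in pixels]
twice = [cycle(p) for p in once]

# Errors are always measured against the original pixels.
for name, out in (("first cycle", once), ("second cycle", twice)):
    for c, label in enumerate(("R'", "G'", "B'")):
        errors = [o[c] - p[c] for o, p in zip(out, pixels)]
        hist = dict(sorted(Counter(errors).items()))
        print(f"{name} {label}: PSNR = {psnr(errors):.2f} dB, histogram = {hist}")
```

Comparing the first-cycle and second-cycle lines of the output gives a quick check of whether errors accumulate under the assumed formulation. A "television-range" variant would need different scaling and offsets, which are not sketched here.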
Table 5. The average PSNR [dB] in the R’G’B’ and YCBCR color spaces for the scheme from Figure 3, in which images are first encoded and then decoded using the AVC or HEVC codec.
| Scenario | Range | PSNR RGB: R’ | G’ | B’ | Average | PSNR YCBCR: Y | CB | CR | Average |
|---|---|---|---|---|---|---|---|---|---|
| Scenario A (8-bit, AVC) | Full range (“Full-swing”) | 31.53 | 32.62 | 31.26 | 31.80 | 32.82 | 41.31 | 41.14 | 38.42 |
| Scenario A (8-bit, AVC) | “Television range” | 30.95 | 31.98 | 30.69 | 31.21 | 33.46 | 41.95 | 41.82 | 39.08 |
| Scenario B (10-bit, HEVC) | Full range (“Full-swing”) | 36.02 | 37.64 | 31.77 | 35.14 | 39.37 | 37.91 | 40.96 | 39.41 |
| Scenario B (10-bit, HEVC) | “Television range” | 35.69 | 37.29 | 31.58 | 34.85 | 40.30 | 38.90 | 41.87 | 40.36 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

MDPI and ACS Style

Domański, M.; Grzelka, A.; Stankiewicz, O. Color Transformations Resulting in Loss of Performance in Modern Video Compression Software Systems. Information 2026, 17, 366. https://doi.org/10.3390/info17040366
