Adaptive Frequency and Assignment Algorithm for Context-Based Arithmetic Compression Codes for H.264 Video Intraframe Encoding

Hsu, Huang-Chun; Ding, Jian-Jiun

doi:10.3390/engproc2025098004

by Huang-Chun Hsu and Jian-Jiun Ding *

Graduate Institute of Communication Engineering, National Taiwan University, Taipei 106319, Taiwan

* Author to whom correspondence should be addressed.

† Presented at the 2024 4th International Conference on Social Sciences and Intelligence Management (SSIM 2024), Taichung, Taiwan, 20–22 December 2024.

Eng. Proc. 2025, 98(1), 4; https://doi.org/10.3390/engproc2025098004

Published: 4 June 2025

(This article belongs to the Proceedings of 2024 4th International Conference on Social Sciences and Intelligence Management (SSIM 2024))

Abstract

In modern communication technology, short videos are increasingly used on social media platforms, and the advancement of video codecs is pivotal. In this study, we developed a new scheme to encode the residue of intraframes. The H.264 baseline profile uses context-adaptive variable-length coding (CAVLC) to encode the residue of integer transforms in a block-wise manner. In the developed method, the DC and AC coefficients are separated. In addition, context assignment, adaptive scanning, range increment, and mutual learning are adopted in a mixture of fixed-length and variable-length schemes, and block-wise compression of the frequency table is applied to obtain improved compression rates. Compressing the frequency table avoids the loss that CAVLC suffers on horizontally/vertically dominated blocks. The developed method outperforms CAVLC, with average bit-length reductions of 7.81%, 8.58%, and 7.88% for quarter common intermediate format (QCIF), common intermediate format (CIF), and full high-definition (FHD) inputs, respectively.

Keywords:

video codec; H.264; intraframe coding; CAVLC; context-based arithmetic coding

1. Introduction

The H.264/AVC codec was introduced in 2003 and has become one of the most widely used video codecs on the Internet [1,2,3,4]. Newer video codecs, such as H.265/HEVC [5], AV1 [6,7,8,9,10], H.266/VVC [11,12], and VP9 [13], are not used as widely as H.264. This dominance is attributed to the royalty fees associated with alternative codecs, unresolved patent issues, and limitations in hardware and software support. HEVC excels in high-resolution video content. However, for typical Internet video, mainly at 1920 × 1080 pixels, the differences between HEVC and earlier codecs can be small. A simplified intraframe coding flow is shown in Figure 1. H.264 employs rate-distortion optimization (RDO) [14] to choose between a single 16 × 16 macroblock and 16 individual 4 × 4 sub-macroblocks. For each 16 × 16 macroblock or 4 × 4 sub-macroblock, the optimal mode is determined by evaluating the sum of absolute differences (SAD) for the available prediction modes (Figure 2). Both the distortion of the reconstructed blocks, measured as the sum of squared differences (SSD), and the associated bit rate are considered to ensure a judicious trade-off between reconstruction quality and compression efficiency (1).

$$\mathrm{Cost} = \mathrm{SSD} + \lambda \times \mathrm{Rate}, \tag{1}$$

where λ is the Lagrange multiplier determined by quantization parameters, calculated as

$$\lambda = 0.85 \cdot 2^{\frac{QP - 12}{3}}. \tag{2}$$
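As a worked illustration of (1) and (2), the following minimal sketch evaluates the Lagrange multiplier and compares two candidate partitions. The function names and the SSD/rate figures of the candidates are hypothetical and chosen only for illustration; they are not taken from the experiments.

```python
def lagrange_multiplier(qp: int) -> float:
    """Lagrange multiplier of Eq. (2): 0.85 * 2^((QP - 12) / 3)."""
    return 0.85 * 2.0 ** ((qp - 12) / 3.0)


def rd_cost(ssd: float, rate_bits: float, qp: int) -> float:
    """Rate-distortion cost of Eq. (1): SSD + lambda * Rate."""
    return ssd + lagrange_multiplier(qp) * rate_bits


def pick_mode(candidates, qp: int) -> str:
    """Return the name of the cheapest candidate.

    Each candidate is a (name, ssd, rate_bits) tuple.
    """
    return min(candidates, key=lambda m: rd_cost(m[1], m[2], qp))[0]


# Hypothetical candidates (illustrative numbers only).
candidates = [("16x16", 900.0, 120.0), ("4x4", 400.0, 380.0)]
print(lagrange_multiplier(15))   # multiplier for QP = 15
print(pick_mode(candidates, 15))
```

For QP = 15, the exponent in (2) equals 1, so λ = 0.85 × 2 = 1.7. With the illustrative numbers above, the costs are 900 + 1.7 × 120 = 1104 and 400 + 1.7 × 380 = 1046, so the 4 × 4 partition would be selected.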

Instead of the discrete cosine transform [15], H.264 adopts an enhanced integer transform [16] designed for hardware compatibility: multiplication operations are replaced with bit shifts, which improves computational efficiency. In entropy coding, CAVLC [17] scans the 16 coefficients in reverse order and exploits the characteristics of AC coefficients by using an adaptive suffix length. A context-based look-up table is selected according to the number of non-zero coefficients in neighboring blocks. However, the suffix length changes immediately when a high-level value is encountered. The method developed in this study replaces this mechanism with an analogous one: the frequency table is adjusted dynamically each time the scan moves to a new diagonal line.

2. Proposed Intraframe Coding Scheme

2.1. Preprocessing

The proposed method maps the signed quantized integer transform coefficients to non-negative integers, with the sign encoded in the parity (3).

$$\mathrm{Coef} = \begin{cases} 2|\mathrm{Coef}|, & \text{if } \mathrm{Coef} \leq 0 \\ 2|\mathrm{Coef}| - 1, & \text{if } \mathrm{Coef} > 0 \end{cases} \tag{3}$$

The mapped coefficients are grouped by their numerical values, as shown in (4).

$$\mathrm{Group} = \left\lceil \log_{2}(\mathrm{Coef} + 2) \right\rceil \tag{4}$$

As in the Huffman table of the Joint Photographic Experts Group (JPEG) format [14,18], the position within a group is specified by additional bits, whose number equals the group number minus 1. This categorization divides each input into a variable-length segment (the group number) and a fixed-length segment (the additional bits). It costs some coding efficiency but mitigates the substantial memory demands associated with the frequency table. Values exceeding 254 are not categorized further but are placed into a single group because of their relatively low probability. In such cases, 12 additional bits are used to handle values within 4096.
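The mapping (3) and the grouping (4) can be sketched as follows. This is a minimal illustration rather than the authors' MATLAB code: the function names, the label 9 for the group of values above 254, and the choice to store the raw mapped value in its 12 additional bits are assumptions.

```python
ESCAPE_LIMIT = 254   # largest mapped value that is still categorized (from the text)
ESCAPE_GROUP = 9     # assumed label for the single group of larger values
ESCAPE_BITS = 12     # additional bits for that group (from the text)


def map_signed(coef: int) -> int:
    """Eq. (3): non-positive values map to even numbers, positive values to odd."""
    return 2 * abs(coef) if coef <= 0 else 2 * abs(coef) - 1


def group_of(mapped: int) -> int:
    """Eq. (4): ceil(log2(mapped + 2)), computed with integer arithmetic."""
    return (mapped + 1).bit_length()


def categorize(coef: int):
    """Return (group, number of additional bits, offset inside the group)."""
    mapped = map_signed(coef)
    if mapped > ESCAPE_LIMIT:
        # Assumption: the raw mapped value is written in the 12 additional bits.
        return ESCAPE_GROUP, ESCAPE_BITS, mapped
    group = group_of(mapped)
    first = 2 ** (group - 1) - 1      # smallest mapped value in this group
    return group, group - 1, mapped - first


for c in (0, 1, -1, 2, -3, -127, 128):
    print(c, categorize(c))
```

Under (4), group g holds the mapped values from 2^(g−1) − 1 to 2^g − 2, that is, 2^(g−1) values, which is why g − 1 additional bits suffice. Group 8 ends at 254, which matches the limit stated above.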

2.2. Context Modeling

Context modeling is a prevalent technique in arithmetic coding [19]. The likelihood of the current input is predicted from the preceding inputs, which mitigates spatial redundancy. For instance, in English text, the letter “q” is most likely to be followed by the letter “u”. Figure 3 illustrates the DC and AC coefficients extracted from Akiyo.cif. The contours observed in the DC coefficients indicate remaining spatial redundancy.

Another notable observation is that higher AC coefficients tend to appear in blocks with higher DC coefficients. Consequently, the context for the DC coefficient is determined by the median of the DC coefficient group numbers of the left, top, and top-left blocks (5).

$$\mathrm{Context} = \mathrm{med}\big(DC(i, j-1),\, DC(i-1, j),\, DC(i-1, j-1)\big). \tag{5}$$

When only one of the left and top blocks is available, that block alone is used; when neither is available, the context is set to 1. The context for the AC coefficients is chosen as DC(i, j).
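The context rule in (5), including the boundary cases, can be sketched as below. The function names and the use of `None` for an unavailable neighbor are assumptions made for this illustration.

```python
from statistics import median


def dc_context(left, top, top_left) -> int:
    """Context for the DC coefficient; None marks an unavailable neighbor."""
    if left is not None and top is not None:
        # Interior block: the top-left neighbor then exists as well, Eq. (5).
        return int(median([left, top, top_left]))
    if left is not None:
        return left
    if top is not None:
        return top
    return 1      # neither neighbor is available


def dc_contexts(groups):
    """Apply dc_context to every block of a 2-D grid of DC group numbers."""
    rows, cols = len(groups), len(groups[0])
    ctx = [[1] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            left = groups[i][j - 1] if j > 0 else None
            top = groups[i - 1][j] if i > 0 else None
            top_left = groups[i - 1][j - 1] if i > 0 and j > 0 else None
            ctx[i][j] = dc_context(left, top, top_left)
    return ctx


# Hypothetical grid of DC group numbers for 2 x 3 blocks.
print(dc_contexts([[3, 4, 4],
                   [2, 5, 3]]))
```

Taking the median rather than the mean keeps the context an integer group number and makes it insensitive to a single outlying neighbor.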

2.3. AC Frequency Table

In CAVLC, the input sequence is scanned in reverse order to exploit the inherent characteristics of AC coefficients: their absolute values tend to diminish from the top-left to the bottom-right of the matrix. Each value is encoded in two parts, a prefix and a suffix. The initial suffix length is 0 and is incremented whenever the input level surpasses the threshold listed in Table 1. As the suffix length grows, the bit length for small levels increases, while that for large levels decreases (Table 2).
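The threshold rule of Table 1 can be traced with the short sketch below. It is a simplified reading of Table 1 (at most one increment per coded level, starting from 0), not the normative H.264 procedure, and the function name and example levels are hypothetical.

```python
# Table 1: threshold that the level magnitude must exceed to leave each suffix length.
THRESHOLDS = {0: 0, 1: 3, 2: 6, 3: 12, 4: 24, 5: 48}


def suffix_length_trace(levels, start=0):
    """Return the suffix length in force when each level is coded."""
    trace = []
    suffix_length = start
    for level in levels:
        trace.append(suffix_length)
        limit = THRESHOLDS.get(suffix_length)   # None at the highest length (6)
        if limit is not None and abs(level) > limit:
            suffix_length += 1                  # never decremented afterwards
    return trace


print(suffix_length_trace([1, 2, 5, 1, 1]))
```

In this trace, the level 5 raises the suffix length from 1 to 2, and the two 1s that follow are still coded with suffix length 2, because the length never decreases.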

The suffix length is increased to transform the probability distribution from a sharp peak at the origin to a comparatively smooth distribution. The process is irreversible: if the input sequence deviates from a perfectly ascending order, losses occur because the zig-zag scan order causes the horizontal and vertical terms to become unsynchronized. For example, if the input block is horizontally dominated, as shown in Table 3, the input sequence becomes “4, 6, 4, 4, 1, 1, 1, 3, 1, 2, 1, …”. The suffix length is extended to 1 upon encountering 3, and the 1s in front of it are encoded in a longer bitstream. To mitigate the bias of horizontal or vertical dominance, the developed method adopts a forward scan order, as shown in Table 4. The frequency table is compressed only when all the inputs in the previous diagonal line have been encoded (Algorithm 1).

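Algorithm 1: Proposed adaptive frequency table squeezing method.

for idx = 1:15
    % On entering a new diagonal line of Table 4, multiply table entries 1..k by a weight,
    % where k is the maximum input of the previous line (DC(i,j) before the first line).
    if idx == 1
        frequency_table(1:DC(i,j)) = frequency_table(1:DC(i,j)) * W;
    elseif idx == 3
        frequency_table(1:max(input(1:2))) = frequency_table(1:max(input(1:2))) * X;
    elseif idx == 6
        frequency_table(1:max(input(3:5))) = frequency_table(1:max(input(3:5))) * Y;
    elseif idx == 10
        frequency_table(1:max(input(6:9))) = frequency_table(1:max(input(6:9))) * Z;
    end
    % input(idx) is then encoded with the current frequency_table
end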

The compression of the frequency table ceases following the longest diagonal line, as this is predominantly composed of these terms. W, X, Y, and Z are adjustable parameters arranged in ascending order. The rationale is that, as the index approaches the bottom-right region, the likelihood of inputs falling within a smaller interval increases. The developed method resolves the issue illustrated by Table 3: under the scan order of Table 4, none of the element values surpasses the maximum value observed in the preceding diagonal line.
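The interplay of the Table 4 scan order and the squeezing steps of Algorithm 1 can be sketched as follows, using the block of Table 3 as input. This is a minimal illustration under stated assumptions: the weight values (1.1, 1.2, 1.3, 1.4), the DC context of 3, the table size of 8, and all function names are hypothetical, and the block values are used directly as table bounds, as `input` is in Algorithm 1.

```python
# Table 4: SCAN[r][c] is the 1-based scan index of position (r, c); 0 marks the DC term.
SCAN = [[0, 2, 3, 9],
        [1, 4, 8, 10],
        [5, 7, 11, 14],
        [6, 12, 13, 15]]

# Table 3, with the DC position (X) replaced by a placeholder 0.
TABLE3 = [[0, 6, 4, 1],
          [4, 4, 3, 2],
          [1, 1, 1, 1],
          [1, 1, 1, 1]]


def scan_block(block):
    """Return the 15 AC values of a 4x4 block in the Table 4 order."""
    out = [0] * 15
    for r in range(4):
        for c in range(4):
            if SCAN[r][c]:
                out[SCAN[r][c] - 1] = block[r][c]
    return out


def squeeze_steps(ac, dc_context, weights):
    """List (idx, bound, weight) for the four squeezing steps of Algorithm 1."""
    w, x, y, z = weights
    return [(1, dc_context, w),          # before the first diagonal line
            (3, max(ac[0:2]), x),        # after inputs 1..2
            (6, max(ac[2:5]), y),        # after inputs 3..5
            (10, max(ac[5:9]), z)]       # after inputs 6..9 (longest line)


def apply_squeeze(table, bound, weight):
    """Multiply entries 1..bound (1-based) of the table by weight, in place."""
    for k in range(min(bound, len(table))):
        table[k] *= weight


ac = scan_block(TABLE3)
print(ac)
table = [1.0] * 8                        # hypothetical flat table with 8 entries
for idx, bound, weight in squeeze_steps(ac, dc_context=3, weights=(1.1, 1.2, 1.3, 1.4)):
    apply_squeeze(table, bound, weight)
    print(idx, bound, table)
```

Scanning Table 3 in the order of Table 4 reproduces the sequence quoted in the text (4, 6, 4, 4, 1, 1, 1, 3, 1, 2, 1, …). The bounds of the last three steps are 6, 4, and 3, and no value in a diagonal line exceeds the bound derived from the line before it.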

2.4. Arithmetic Coding

The frequency table is initialized in a geometric sequence (6) to facilitate more rapid adaptation to the real probability distribution.

$$F[x] = C \cdot \alpha^{-x}, \quad \alpha > 1. \tag{6}$$

When an input x is encoded in context c, an increment is applied to the corresponding position of the frequency table and, with a smaller magnitude, to the neighboring values (range increment). The mathematical form is given as follows, where d is the distance from the encoded position.

$$F[c,\, x-d : x+d] = F[c,\, x-d : x+d] + \mathrm{Amplitude} \cdot e^{-\sigma d}. \tag{7}$$

The concept of mutual learning is similar to that of range increment, but the adjustment is made according to the neighboring contexts with identical positioning.

$$F[c-d : c+d,\, x] = F[c-d : c+d,\, x] + \mathrm{Amplitude} \cdot e^{-\sigma d}. \tag{8}$$

The objective of both the range increment and mutual learning approaches is to reduce the redundancy that arises because neighboring histogram entries vary monotonically and nearby contexts have similar probability distributions.
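The initialization (6) and the two update rules (7) and (8) can be sketched as below. This is an illustration under stated assumptions, not the authors' implementation: symbols are indexed from 0, d is read as the distance from the encoded entry, the mutual-learning step skips d = 0 so that the encoded entry is incremented only once, and the parameter values (C = 16, α = 2, Amplitude = 4, σ = ln 2, radius 1) are hypothetical.

```python
import math


def init_table(n_contexts, n_symbols, c=16.0, alpha=2.0):
    """Eq. (6): every context starts from the geometric sequence C * alpha^(-x)."""
    return [[c * alpha ** (-x) for x in range(n_symbols)]
            for _ in range(n_contexts)]


def range_increment(table, ctx, x, amplitude, sigma, radius):
    """Eq. (7): raise F[ctx][x] and its neighbors along the value axis."""
    for d in range(-radius, radius + 1):
        if 0 <= x + d < len(table[ctx]):
            table[ctx][x + d] += amplitude * math.exp(-sigma * abs(d))


def mutual_learning(table, ctx, x, amplitude, sigma, radius):
    """Eq. (8): raise F[.][x] in the neighboring contexts (d = 0 is skipped)."""
    for d in range(-radius, radius + 1):
        if d != 0 and 0 <= ctx + d < len(table):
            table[ctx + d][x] += amplitude * math.exp(-sigma * abs(d))


table = init_table(n_contexts=3, n_symbols=5)
range_increment(table, ctx=1, x=2, amplitude=4.0, sigma=math.log(2), radius=1)
mutual_learning(table, ctx=1, x=2, amplitude=4.0, sigma=math.log(2), radius=1)
for row in table:
    print([round(v, 3) for v in row])
```

With σ = ln 2, the increment halves for each step away from the encoded entry, so the entry itself gains 4 and each direct neighbor, along either the value axis or the context axis, gains 2.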

3. Results

We selected 10 CIF (352 × 288), 7 QCIF (176 × 144), and 9 FHD (1920 × 1080) inputs [20,21,22]. The prediction method was the same as that presented in Ref. [23], while other information, such as the prediction type and other headers, was neglected. An exhaustive search was applied, and the QP for RDO was set to 15. Only the bitstreams that recorded the residue were considered. All experiments were run in MATLAB R2020a. Table 5 compares the bit lengths of CAVLC and the proposed method for the CIF inputs. All of the test files were compressed better by the proposed method.
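The reduction column of the result tables is read here as the relative change in bit length with respect to CAVLC. The sketch below applies that reading to two rows of Table 5; the function name is hypothetical.

```python
def reduction_percent(cavlc_bits: int, proposed_bits: int) -> float:
    """Relative change in bit length, in percent (negative means fewer bits)."""
    return 100.0 * (proposed_bits - cavlc_bits) / cavlc_bits


# Two rows of Table 5: (CAVLC bits, proposed bits).
rows = {"bridge-close": (370383, 331572), "foreman": (275447, 251234)}
for name, (cavlc, proposed) in rows.items():
    print(f"{name}: {reduction_percent(cavlc, proposed):.2f}%")
```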

The inputs Akiyo, mother, and news showed the smallest improvements. Their histograms reveal a common characteristic: the distribution declines sharply beyond the origin. Table 6 presents the bit lengths for QCIF. The proposed method outperformed CAVLC for all inputs. However, the average reduction in bit length was less than that observed for CIF. This can be explained by the fact that even the minimal block size of H.264 intraframe prediction, namely 4 × 4, occupies a significant portion of a 176 × 144 frame. Consequently, the bit length is constrained by the inaccuracies inherent in prediction.

Table 7 shows the results for FHD inputs. Most of the inputs were reduced by more than 6%, while west wind and time lapse showed only slight improvements. The reason is the mechanism of run-length encoding (RLE). In CAVLC, the information on zero coefficients and the trailing ±1 coefficients (up to three) is encoded as zeros_left, run_before, and trailing ones. If the inputs are composed mostly of 0s and a few 1s, CAVLC achieves a low bit rate. However, unlike CAVLC, the developed method discards the concept of zero RLE and, therefore, did not perform well on these two inputs, which contain large smooth regions.

4. Conclusions

In this study, we developed an intraframe coding method based on H.264 using context-based arithmetic coding together with several improvement techniques. The DC and AC coefficients were coded in two separate stages, appropriate contexts were chosen step by step, and the frequency table was compressed based on the maximum value in the previous diagonal line. With these techniques, the developed method reduced the bit lengths of the QCIF, CIF, and FHD inputs by 7.81%, 8.58%, and 7.88%, respectively, compared with the classic CAVLC of the H.264 baseline profile. As the demand for high-resolution videos grows, the improvement and development of video codecs continue. As H.264 is no longer sufficient for present requirements, newer encoding technologies for higher resolutions need to be developed. Because the developed method is compatible with H.264, it can also be applied in future codecs.

Author Contributions

Conceptualization, H.-C.H. and J.-J.D.; methodology, H.-C.H.; software, H.-C.H.; validation, H.-C.H.; formal analysis, H.-C.H. and J.-J.D.; investigation, H.-C.H.; resources, H.-C.H.; data curation, H.-C.H.; writing—original draft, H.-C.H.; writing—review and editing, H.-C.H. and J.-J.D.; visualization, H.-C.H.; supervision, J.-J.D.; project administration, J.-J.D.; funding acquisition, J.-J.D. All authors have read and agreed to the published version of the manuscript.

Funding

This research work was funded by the National Science and Technology Council, Taiwan, R.O.C., grant number MOST 110-2221-E-002-092-MY3.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data of this article are available upon request from the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Mansri, I.; Doghmane, N.; Kouadria, N.; Harize, S.; Bekhouch, A. Comparative evaluation of VVC, HEVC, H.264, AV1, and VP9 encoders for low-delay video applications. In Proceedings of the 2020 Fourth International Conference on Multimedia Computing, Networking and Applications (MCNA), Valencia, Spain, 19–22 October 2020; pp. 38–43.
2. Sullivan, G.J.; Wiegand, T. Video compression: From concepts to the H.264/AVC standard. Proc. IEEE 2005, 93, 18–31.
3. Wiegand, T.; Sullivan, G.J.; Bjontegaard, G.; Luthra, A. Overview of the H.264/AVC video coding standard. IEEE Trans. Circuits Syst. Video Technol. 2003, 13, 560–576.
4. Luthra, A.; Sullivan, G.J.; Wiegand, T. Introduction to the special issue on the H.264/AVC video coding standard. IEEE Trans. Circuits Syst. Video Technol. 2003, 13, 557–559.
5. Pastuszak, G.; Abramowski, A. Algorithm and architecture design of the H.265/HEVC intra encoder. IEEE Trans. Circuits Syst. Video Technol. 2015, 26, 210–222.
6. Lainema, J.; Bossen, F.; Han, W.J.; Min, J.; Ugur, K. Intra coding of the HEVC standard. IEEE Trans. Circuits Syst. Video Technol. 2012, 22, 1792–1801.
7. Sullivan, G.J.; Ohm, J.R.; Han, W.J.; Wiegand, T. Overview of the high efficiency video coding (HEVC) standard. IEEE Trans. Circuits Syst. Video Technol. 2012, 22, 1649–1668.
8. Saldanha, M.; Corrêa, M.; Corrêa, G.; Palomino, D.; Porto, M.; Zatt, B.; Agostini, L. An overview of dedicated hardware designs for state-of-the-art AV1 and H.266/VVC video codecs. In Proceedings of the 2020 27th IEEE International Conference on Electronics, Circuits and Systems (ICECS), Glasgow, UK, 23–25 November 2020; pp. 1–4.
9. Mukherjee, D.; Bankoski, J.; Grange, A.; Han, J.; Koleszar, J.; Wilkins, P.; Xu, Y.; Bultje, R. The latest open-source video codec VP9—An overview and preliminary results. In Proceedings of the 2013 Picture Coding Symposium (PCS), San Jose, CA, USA, 8–11 December 2013; pp. 390–393.
10. Chen, Y.; Mukherjee, D.; Han, J.; Grange, A.; Xu, Y.; Liu, Z.; Parker, S.; Chen, C.; Su, H.; Joshi, U.; et al. An overview of core coding tools in the AV1 video codec. In Proceedings of the 2018 Picture Coding Symposium (PCS), San Francisco, CA, USA, 24–27 June 2018; pp. 41–45.
11. Viitanen, M.; Sainio, J.; Mercat, A.; Lemmetti, A.; Vanne, J. From HEVC to VVC: The first development steps of a practical intra video encoder. IEEE Trans. Consum. Electron. 2022, 68, 139–148.
12. Bross, B.; Chen, J.; Ohm, J.R.; Sullivan, G.J.; Wang, Y.K. Developments in international video coding standardization after AVC, with an overview of versatile video coding (VVC). Proc. IEEE 2021, 109, 1463–1493.
13. Grange, A.; de Rivaz, P.; Hunt, J. VP9 Bitstream & Decoding Process Specification. Available online: https://storage.googleapis.com/downloads.webmproject.org/docs/vp9/vp9-bitstream-specification-v0.6-20160331-draft.pdf (accessed on 29 February 2024).
14. Sullivan, G.J.; Wiegand, T. Rate-distortion optimization for video compression. IEEE Signal Process. Mag. 1998, 15, 74–90.
15. Ahmed, N.; Natarajan, T.; Rao, K.R. Discrete cosine transform. IEEE Trans. Comput. 1974, 23, 90–93.
16. Malvar, H.S.; Hallapuro, A.; Karczewicz, M.; Kerofsky, L. Low-complexity transform and quantization in H.264/AVC. IEEE Trans. Circuits Syst. Video Technol. 2003, 13, 598–603.
17. Ghasempour, M.; Ghanbari, M. A low complexity system for multiple data embedding into H.264 coded video bit-stream. IEEE Trans. Circuits Syst. Video Technol. 2020, 30, 4009–4019.
18. Wallace, G.K. The JPEG still picture compression standard. IEEE Trans. Consum. Electron. 1992, 38, 18–34.
19. Ding, J.J.; Wang, I.H.; Chen, H.Y. Improved efficiency on adaptive arithmetic coding for data compression using range-adjusting scheme, increasingly adjusting step, and mutual-learning scheme. IEEE Trans. Circuits Syst. Video Technol. 2018, 28, 3412–3423.
20. Internet Archive. 2023. Available online: https://web.archive.org/web/20230509144046/http://trace.eas.asu.edu/yuv/index.html (accessed on 29 February 2024).
21. 1920x1080.yuv Images for AVC Codec. 2017. Available online: https://github.com/ireader/avcodec/blob/master/libavo/test/1920x1080.yuv (accessed on 29 February 2024).
22. Index of Video. 2010. Available online: https://media.xiph.org/video (accessed on 29 February 2024).
23. Richardson, I.E.G. H.264 and MPEG-4 Video Compression: Video Coding for Next-Generation Multimedia; John Wiley & Sons: West Sussex, UK, 2004.

Figure 1. H.264 intraframe coding flow.

Figure 2. Intra-prediction modes for 4 × 4 and 16 × 16 blocks (for 16 × 16 blocks, only Modes 0–3 are available). Since Mode 2 corresponds to the DC mode, it has no prediction direction and therefore no arrow.

Figure 3. DC coefficients (left) and DC and AC coefficients (right) of Akiyo.cif with sizes of 72 × 88 and 288 × 352, respectively.

Table 1. Relationship between threshold and suffix length.

Current Suffix Length	Threshold for Increasing Suffix Length
0	0
1	3
2	6
3	12
4	24
5	48
6	N/A (highest)

Table 2. Parts of level code with corresponding suffix length.

Level	Suffix Length = 0	Level	Suffix Length = 1
1	1	1	10
−1	01	−1	11
2	001	2	010
−2	0001	−2	011
3	00001	3	0010
−3	000001	−3	0011
…	…	…	…
−7	00000000000001	14	000000000000010
±8 to ±15	000000000000001xxxx	−14	000000000000011
> ±15	0000000000000001xxxxxxxxxxxx	15	0000000000000010
		−15	0000000000000011
		> ±15	0000000000000001xxxxxxxxxxxx

xxxx and xxxxxxxxxxxx indicate extra bits.

Table 3. An example of the input of scanning.

X	6	4	1
4	4	3	2
1	1	1	1
1	1	1	1

Table 4. Adaptive scanning order in the proposed method.

X	2	3	9
1	4	8	10
5	7	11	14
6	12	13	15

Table 5. Bit length comparison of CAVLC and proposed method for CIF inputs.

Input	CAVLC	Proposed	Reduction
Akiyo	177,925	171,230	−3.18%
bridge-close	370,383	331,572	−10.48%
bridge-far	301,698	257,717	−14.58%
bus	399,413	362,545	−9.23%
foreman	275,447	251,234	−8.79%
flower	359,706	326,845	−9.14%
mother	187,798	181,465	−3.37%
news	232,595	222,265	−4.44%
silent	342,727	302,813	−11.64%
waterfall	400,367	356,244	−11.02%
Average bit reduction			−8.58%

Table 6. Bit length comparison of CAVLC and proposed method for QCIF inputs.

Input	CAVLC	Proposed	Reduction
Akiyo	52,938	51,761	−2.22%
bridge-close	103,432	91,247	−11.78%
bridge-far	77,659	67,597	−12.96%
foreman	79,643	73,088	−8.23%
mother	59,212	56,002	−5.42%
news	72,418	69,272	−4.34%
silent	88,899	80,225	−9.76%
Average bit reduction			−7.81%

Table 7. Bit length comparison of CAVLC and the proposed method for FHD inputs.

Input	CAVLC	Proposed	Reduction
AOV5	4,711,101	4,405,510	−6.49%
Time lapse	3,030,756	2,882,682	−4.89%
camera1	2,979,962	2,772,831	−6.95%
west wind	3,353,710	3,270,767	−2.47%
rush fields	5,436,880	5,000,844	−8.02%
controlled burn	6,905,301	6,314,721	−8.55%
life	5,374,526	4,741,603	−11.78%
pedestrian	3,422,777	3,089,042	−9.75%
park joy	6,513,638	5,729,934	−12.03%
Average bit reduction			−7.88%

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
