Proceeding Paper

Adaptive Frequency and Assignment Algorithm for Context-Based Arithmetic Compression Codes for H.264 Video Intraframe Encoding †

Huang-Chun Hsu and Jian-Jiun Ding

Graduate Institute of Communication Engineering, National Taiwan University, Taipei 106319, Taiwan
* Author to whom correspondence should be addressed.
† Presented at the 2024 4th International Conference on Social Sciences and Intelligence Management (SSIM 2024), Taichung, Taiwan, 20–22 December 2024.
Eng. Proc. 2025, 98(1), 4; https://doi.org/10.3390/engproc2025098004
Published: 4 June 2025

Abstract

In modern communication technology, short videos are increasingly used on social media platforms, and advances in video codecs are pivotal to communication. In this study, we developed a new scheme to encode the residue of intraframes. The H.264 baseline profile uses context-adaptive variable-length coding (CAVLC) to encode the residue of integer transforms in a block-wise manner. In the developed method, the DC and AC coefficients are separated. In addition, context assignment, adaptive scanning, range increment, and mutual learning are adopted in a mixture of fixed-length and variable-length schemes, and block-wise compression of the frequency table is applied to obtain improved compression rates. Compressing the frequency table avoids the losses that CAVLC incurs on horizontally/vertically dominated blocks. The developed method outperforms CAVLC, with average bit-length reductions of 7.81%, 8.58%, and 7.88% for quarter common intermediate format (QCIF), common intermediate format (CIF), and full high-definition (FHD) inputs, respectively.

1. Introduction

The H.264/AVC codec was introduced in 2003 and has become one of the most widely used video codecs on the Internet [1,2,3,4]. Despite the advent of newer video codecs, such as H.265/HEVC [5], AV1 [6,7,8,9,10], H.266/VVC [11,12], and VP9 [13], none of them is used as widely as H.264. This dominance is attributed to the royalty fees associated with alternative codecs, unresolved patent issues, and limitations in hardware and software support. HEVC excels in high-resolution video content. However, for typical Internet video, mainly at a resolution of 1920 × 1080 pixels, the disparities between HEVC and previous codecs are small. A simplified intraframe coding flow is shown in Figure 1. H.264 employs rate-distortion optimization (RDO) [14] to decide whether a macroblock is coded as a single 16 × 16 block or as 16 individual 4 × 4 sub-blocks. For each 16 × 16 macroblock or 4 × 4 sub-block, the optimal mode is determined by evaluating the sum of absolute differences (SAD) for the available prediction modes (Figure 2). Both the distortion of the reconstructed blocks, measured as the sum of squared differences (SSD), and the associated bit rates are considered to ensure a judicious trade-off between reconstruction quality and compression efficiency (1).
$$\mathrm{Cost} = \mathrm{SSD} + \lambda \times \mathrm{Rate}, \tag{1}$$
where λ is the Lagrange multiplier determined by quantization parameters, calculated as
$$\lambda = 0.85 \times 2^{(\mathrm{QP} - 12)/3}. \tag{2}$$
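As a quick numerical check of the cost function, the following sketch evaluates the Lagrange multiplier and the resulting cost. It assumes the multiplier reads as 0.85 × 2^((QP − 12)/3), the form commonly used for H.264 mode decision; the function names and the SSD and rate values are illustrative only and do not come from the experiments.

```python
def lagrange_multiplier(qp: int) -> float:
    """Lagrange multiplier, assuming lambda = 0.85 * 2^((QP - 12) / 3)."""
    return 0.85 * 2.0 ** ((qp - 12) / 3.0)


def rd_cost(ssd: float, rate_bits: float, qp: int) -> float:
    """Rate-distortion cost: Cost = SSD + lambda * Rate."""
    return ssd + lagrange_multiplier(qp) * rate_bits


if __name__ == "__main__":
    # QP = 15 is the setting reported in the Results section.
    print(lagrange_multiplier(15))
    # Hypothetical distortion and rate for one candidate mode.
    print(rd_cost(ssd=100.0, rate_bits=40.0, qp=15))
```

Under this reading, raising QP by 3 doubles the multiplier, so the rate term weighs more heavily at coarser quantization.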
Instead of using the discrete cosine transform [15], H.264 adopts an integer transform [16] that is more hardware-friendly: it replaces multiplications with additions and bit shifts, thereby improving computational efficiency. In entropy coding, CAVLC [17] scans the 16 coefficients in reverse order and exploits the characteristics of AC coefficients by using an adaptive suffix length. A context-based look-up table is selected according to the number of non-zero coefficients in neighboring blocks. However, the suffix length changes immediately upon encountering a high-level value. The method developed in this study replaces this mechanism with an analogous one that dynamically adjusts the frequency table each time the scan moves to a new diagonal line.

2. Proposed Intraframe Coding Scheme

2.1. Preprocessing

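The proposed method maps the signed quantized integer transform coefficients to non-negative integers whose parity encodes the sign (3).
$$\mathrm{Coef} = \begin{cases} 2\,|\mathrm{Coef}|, & \text{if } \mathrm{Coef} \le 0, \\ 2\,|\mathrm{Coef}| - 1, & \text{if } \mathrm{Coef} > 0. \end{cases} \tag{3}$$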
The coefficients are grouped by their numerical values, as shown in (4).
$$\mathrm{Group} = \left\lceil \log_2(\mathrm{Coef} + 2) \right\rceil. \tag{4}$$
With additional bits equal to the group number minus 1, the position within the group is determined similarly to the Huffman table for the Joint Photographic Experts Group (JPEG) format [14,18]. This categorization divides the input into a variable-length segment (the group number) and a fixed-length segment (the additional bits), which diminishes coding efficiency but concurrently mitigates the substantial memory demands associated with the frequency table. Values exceeding 254 are not categorized further but are instead placed into a single group because of their relatively low probability. In such cases, 12 additional bits are used to handle values within 4096.
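The preprocessing step can be illustrated with a minimal Python sketch. It assumes that the 254 limit applies to the mapped (non-negative) value, that the escape group is numbered 9, and that the position within a group is the offset from the group's smallest member; none of these details is stated explicitly in the text, and the names `map_coef`, `group_of`, and `categorize` are hypothetical.

```python
ESCAPE_LIMIT = 254   # largest mapped value that is still categorized
ESCAPE_GROUP = 9     # assumed label for the single group of larger values
ESCAPE_BITS = 12     # additional bits used for the escape group


def map_coef(coef: int) -> int:
    """Map a signed coefficient to a non-negative integer, as in (3)."""
    return 2 * abs(coef) if coef <= 0 else 2 * abs(coef) - 1


def group_of(mapped: int) -> int:
    """Group number ceil(log2(mapped + 2)), as in (4), in integer arithmetic."""
    return (mapped + 1).bit_length()


def categorize(coef: int):
    """Return (group, number of additional bits, offset within the group)."""
    mapped = map_coef(coef)
    if mapped > ESCAPE_LIMIT:
        # Assumed convention: offset counted from the first escaped value.
        return ESCAPE_GROUP, ESCAPE_BITS, mapped - (ESCAPE_LIMIT + 1)
    group = group_of(mapped)
    first = 2 ** (group - 1) - 1          # smallest mapped value in the group
    return group, group - 1, mapped - first


if __name__ == "__main__":
    for c in (0, 1, -1, 2, -127, 128):
        print(c, categorize(c))
```

With this grouping, group g holds the 2^(g−1) mapped values from 2^(g−1) − 1 to 2^g − 2, so group 8 ends exactly at 254, which matches the limit stated above.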

2.2. Context Modeling

Context modeling is a prevalent technique in arithmetic coding [19]. The likelihood of the current input is predicted based on the influence of preceding inputs, which mitigates spatial redundancy. For instance, the letter “u” is most likely to follow the letter “q” in English text. Figure 3 illustrates the DC and AC coefficients extracted from the Akiyo.cif sequence. The contours observed in the DC coefficients suggest remaining spatial redundancy.
Another notable observation is the tendency for higher AC coefficients to appear in blocks with higher DC coefficients. Consequently, the context of a DC coefficient is determined by the median of the DC coefficient group numbers of the left, top, and top-left blocks (5).
$$\mathrm{Context} = \mathrm{med}\bigl(DC(i, j-1),\; DC(i-1, j),\; DC(i-1, j-1)\bigr). \tag{5}$$
When only the left or only the top block is available, its group number is used as the context; when neither is accessible, the context is set to 1. The context of the AC coefficients is chosen as DC(i, j).
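The context rule can be sketched as follows. The sketch assumes that `groups` is a row-major 2D list of DC group numbers indexed as `groups[i][j]`, with i the block row and j the block column; the function names are hypothetical.

```python
from statistics import median


def dc_context(groups, i, j):
    """Context for the DC coefficient of block (i, j), following (5)."""
    has_top = i > 0
    has_left = j > 0
    if has_top and has_left:
        # Median of the left, top, and top-left DC group numbers.
        return int(median((groups[i][j - 1],
                           groups[i - 1][j],
                           groups[i - 1][j - 1])))
    if has_left:
        return groups[i][j - 1]
    if has_top:
        return groups[i - 1][j]
    return 1  # neither neighbor is available


def ac_context(groups, i, j):
    """Context for the AC coefficients: the block's own DC group number."""
    return groups[i][j]


if __name__ == "__main__":
    groups = [[3, 5, 2],
              [4, 1, 6]]
    print(dc_context(groups, 1, 1), ac_context(groups, 1, 2))
```

Using the median rather than the mean keeps a single outlying neighbor from shifting the context.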

2.3. AC Frequency Table

In CAVLC, the input sequence is scanned in reverse order, leveraging the inherent characteristics of AC coefficients: their absolute values tend to diminish from the top-left to the bottom-right of the block. Each value is encoded in two parts, a prefix and a suffix. The initial suffix length is 0 and is incremented whenever the input level surpasses the threshold, as illustrated in Table 1. As the suffix length grows, the bit length for lower levels increases, while that for larger values decreases (Table 2).
The suffix length is augmented to transform the probability distribution from a sharp peak at the origin to a comparatively smooth distribution. The process is irreversible, meaning that if the input sequence deviates from a perfectly ascending order, losses occur because the zig-zag scan order causes the horizontal/vertical terms to become unsynchronized. For example, if the input block is horizontally dominated, as shown in Table 3, the input sequence becomes “4, 6, 4, 4, 1, 1, 1, 3, 1, 2, 1, …”. The suffix length is extended to 1 upon encountering 3, and the 1s in front of it are encoded in a longer bitstream. To mitigate the bias of horizontal or vertical dominance, the developed method adopts a forward scan order, as shown in Table 4. The frequency table is compressed only when all the inputs in the previous diagonal line are encoded (Algorithm 1).
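The forward scan can be checked against the example with a short Python sketch. The block values are taken from Table 3 and the scan indices from Table 4; the DC position (marked X in both tables) is represented here by `None` and by index 0, which is a convention of this sketch only.

```python
# Example block from Table 3 (None marks the DC position).
BLOCK = [
    [None, 6, 4, 1],
    [4, 4, 3, 2],
    [1, 1, 1, 1],
    [1, 1, 1, 1],
]

# Scan indices from Table 4 (0 marks the DC position, which is not scanned).
SCAN_ORDER = [
    [0, 2, 3, 9],
    [1, 4, 8, 10],
    [5, 7, 11, 14],
    [6, 12, 13, 15],
]


def forward_scan(block, order):
    """Return the 15 AC values in the forward scan order of Table 4."""
    out = [None] * 15
    for r in range(4):
        for c in range(4):
            idx = order[r][c]
            if idx > 0:
                out[idx - 1] = block[r][c]
    return out


def diagonals(seq):
    """Split a scanned sequence into the six diagonal lines of Table 4."""
    bounds = [(0, 2), (2, 5), (5, 9), (9, 12), (12, 14), (14, 15)]
    return [seq[a:b] for a, b in bounds]


if __name__ == "__main__":
    seq = forward_scan(BLOCK, SCAN_ORDER)
    print(seq)
    print(diagonals(seq))
```

The scanned sequence begins 4, 6, 4, 4, 1, 1, 1, 3, 1, 2, 1, matching the sequence quoted in the text, and the diagonal boundaries fall at scan indices 3, 6, and 10.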
Algorithm 1: Proposed adaptive frequency table squeezing method.
 for idx = 1:15
  % "*=" means that the left-hand side is multiplied in place by the given factor
  if idx == 1
   frequency_table(1:DC(i,j)) *= W
  elseif idx == 3
   frequency_table(1:max(input(1:2))) *= X
  elseif idx == 6
   frequency_table(1:max(input(3:5))) *= Y
  elseif idx == 10
   frequency_table(1:max(input(6:9))) *= Z
  end
 end
The compression of the frequency table ceases following the longest diagonal line, as this is predominantly composed of these terms. Furthermore, W, X, Y, and Z are adjustable parameters and are arranged in ascending order. This arrangement stems from the rationale that, as the index approaches the bottom-right region, the likelihood of inputs falling within a smaller interval increases. The developed method resolves the issue depicted in Table 3 and Table 4, as none of the element values surpass the maximum value observed in the preceding diagonal line.
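Algorithm 1 can be sketched in Python as follows. This is an illustrative reading rather than the authors' implementation: it assumes that the frequency table is a list indexed by group number (entry k − 1 for group k), that `inputs` holds the group numbers of the AC coefficients in scan order, and that the weights take the hypothetical values shown, since the text only states that W, X, Y, and Z are adjustable and ascending.

```python
# Scan indices (1-based, as in Algorithm 1) at which the table is squeezed,
# mapped to the 1-based range of earlier inputs that sets the upper bound.
# None means that the bound is the DC group number instead.
SQUEEZE_POINTS = {1: None, 3: (1, 2), 6: (3, 5), 10: (6, 9)}

# Hypothetical ascending weights W, X, Y, Z; the source gives no values.
WEIGHTS = {1: 1.5, 3: 2.0, 6: 2.5, 10: 3.0}


def squeeze_step(freq, idx, dc_group, inputs, weights=WEIGHTS):
    """Scale the first `upper` table entries in place at a new diagonal line."""
    if idx not in SQUEEZE_POINTS:
        return
    span = SQUEEZE_POINTS[idx]
    if span is None:
        upper = dc_group
    else:
        first, last = span
        upper = max(inputs[first - 1:last])
    for k in range(min(upper, len(freq))):
        freq[k] *= weights[idx]


def run_block(freq, dc_group, inputs):
    """Apply the squeezing schedule over the 15 AC positions of one block."""
    for idx in range(1, 16):
        squeeze_step(freq, idx, dc_group, inputs)
        # A real coder would encode inputs[idx - 1] with `freq` at this point.
    return freq


if __name__ == "__main__":
    groups = [4, 4, 4, 4, 2, 2, 2, 3, 2, 3, 2, 2, 2, 2, 2]
    print(run_block([1.0] * 6, dc_group=5, inputs=groups))
```

Because each step only reads inputs from the preceding diagonal line, a decoder that has already decoded those inputs can reproduce the same scaling without side information.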

2.4. Arithmetic Coding

The frequency table is initialized as a geometric sequence (6) to facilitate more rapid adaptation to the real probability distribution.
$$F[x] = C \cdot \alpha^{-x}, \quad \alpha > 1. \tag{6}$$
When an input is encoded, increments are applied to the corresponding position within the frequency table and to neighboring values, albeit with a smaller magnitude. The mathematical form is given as follows.
$$F[c,\; x-d : x+d] = F[c,\; x-d : x+d] + \mathrm{Amplitude} \cdot e^{-\sigma d}.$$
The concept of mutual learning is similar to that of range increment, but the adjustment is made according to the neighboring contexts with identical positioning.
$$F[c-d : c+d,\; x] = F[c-d : c+d,\; x] + \mathrm{Amplitude} \cdot e^{-\sigma d}.$$
The objective of both the range increment and mutual learning approaches is to alleviate the redundancy caused by the histogram’s monotonicity of neighboring elements and the similarity of the probability distribution among nearby contexts.
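The three table operations can be sketched in Python under stated assumptions. The sketch reads the initialization as a decaying geometric sequence and the increments as decaying exponentially with the distance d; it applies the update at distance d to the two entries at offsets −d and +d, and it starts mutual learning at d = 1 so that the encoded entry itself is not incremented twice. The parameter values, the `radius` cut-off, and all function names are hypothetical.

```python
import math


def init_table(n_contexts, n_symbols, c=64.0, alpha=1.5):
    """Initialize every context row as F[x] = C * alpha^(-x)."""
    return [[c * alpha ** (-x) for x in range(n_symbols)]
            for _ in range(n_contexts)]


def range_increment(table, ctx, x, amplitude, sigma, radius):
    """Increment symbol x and its neighbors within the same context."""
    row = table[ctx]
    for d in range(radius + 1):
        bump = amplitude * math.exp(-sigma * d)
        for pos in {x - d, x + d}:          # a set avoids a double hit at d = 0
            if 0 <= pos < len(row):
                row[pos] += bump


def mutual_learning(table, ctx, x, amplitude, sigma, radius):
    """Increment the same symbol position in neighboring contexts."""
    for d in range(1, radius + 1):
        bump = amplitude * math.exp(-sigma * d)
        for c in (ctx - d, ctx + d):
            if 0 <= c < len(table):
                table[c][x] += bump


if __name__ == "__main__":
    table = init_table(3, 5, c=16.0, alpha=2.0)
    range_increment(table, ctx=1, x=2, amplitude=4.0, sigma=0.5, radius=1)
    mutual_learning(table, ctx=1, x=2, amplitude=4.0, sigma=0.5, radius=1)
    for row in table:
        print([round(v, 3) for v in row])
```

Both updates touch only entries that a decoder can also update after decoding the same symbol, so encoder and decoder tables stay synchronized.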

3. Results

We selected 10 CIF (352 × 288), 7 QCIF (176 × 144), and 9 FHD (1920 × 1080) inputs [20,21,22]. The prediction method was the same as that presented in Ref. [23], while other information—such as the prediction type and other headers—was neglected. An exhaustive search was applied, and the QP for RDO was set as 15. Only the bitstreams that recorded the residue were considered. The whole program was run in MATLAB R2020a. Table 5 displays the bit length comparison between CAVLC and the proposed method for CIF. All of the test files exhibited better performance when employing our proposed method.
The inputs Akiyo, mother, and news showed the smallest improvements. Their histograms reveal a common characteristic: the distribution declines sharply away from the origin. Table 6 presents the bit lengths for QCIF. The proposed method showed superior performance over CAVLC for all inputs. However, the average reduction in bit length was less than that observed for CIF. This can be explained by the fact that even the minimal block size of H.264 intraframe prediction, namely 4 × 4, occupies a significant portion of a 176 × 144 frame. Consequently, the bit length is constrained by the inaccuracies inherent in prediction.
Table 7 shows the results for FHD inputs. Most of the inputs were reduced by more than 6%, while west wind and time lapse showed only slight improvements. The reason lies in the mechanism of run-length encoding (RLE). In CAVLC, the information on zero coefficients and the last three 1s is encoded as zeros_left, run_before, and trailing ones. If the inputs are composed mostly of 0s and a few 1s, CAVLC achieves a short bit length. Unlike CAVLC, the developed method discards zero RLE and therefore did not perform as well on the two inputs that contain large smooth regions.

4. Conclusions

In this study, we developed an intraframe coding method based on H.264 using context-based arithmetic coding together with several improvement techniques. By dividing the coding of the DC/AC coefficients into two stages, appropriate contexts were chosen step by step, and the frequency table was compressed based on the maximum value in the previous diagonal line. With these techniques, the developed method reduced the bit lengths of the QCIF, CIF, and FHD inputs by 7.81%, 8.58%, and 7.88%, respectively, compared with those achieved using the classic CAVLC of the H.264 baseline profile. As the demand for high-resolution videos grows, the improvement and development of video codecs continue. As H.264 is no longer sufficient for present requirements, newer encoding technologies for higher resolutions need to be developed. Because the developed method is compatible with H.264, it may also be applicable to future codecs.

Author Contributions

Conceptualization, H.-C.H. and J.-J.D.; methodology, H.-C.H.; software, H.-C.H.; validation, H.-C.H.; formal analysis, H.-C.H. and J.-J.D.; investigation, H.-C.H.; resources, H.-C.H.; data curation, H.-C.H.; writing—original draft, H.-C.H.; writing—review and editing, H.-C.H. and J.-J.D.; visualization, H.-C.H.; supervision, J.-J.D.; project administration, J.-J.D.; funding acquisition, J.-J.D. All authors have read and agreed to the published version of the manuscript.

Funding

This research work was funded by the National Science and Technology Council, Taiwan, R.O.C., grant number MOST 110-2221-E-002-092-MY3.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data of this article are available upon request from the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Mansri, I.; Doghmane, N.; Kouadria, N.; Harize, S.; Bekhouch, A. Comparative evaluation of VVC, HEVC, H.264, AV1, and VP9 encoders for low-delay video applications. In Proceedings of the 2020 Fourth International Conference on Multimedia Computing, Networking and Applications (MCNA), Valencia, Spain, 19–22 October 2020; pp. 38–43.
  2. Sullivan, G.J.; Wiegand, T. Video compression: From concepts to the H.264/AVC standard. Proc. IEEE 2005, 93, 18–31.
  3. Wiegand, T.; Sullivan, G.J.; Bjontegaard, G.; Luthra, A. Overview of the H.264/AVC video coding standard. IEEE Trans. Circuits Syst. Video Technol. 2003, 13, 560–576.
  4. Luthra, A.; Sullivan, G.J.; Wiegand, T. Introduction to the special issue on the H.264/AVC video coding standard. IEEE Trans. Circuits Syst. Video Technol. 2003, 13, 557–559.
  5. Pastuszak, G.; Abramowski, A. Algorithm and architecture design of the H.265/HEVC intra encoder. IEEE Trans. Circuits Syst. Video Technol. 2015, 26, 210–222.
  6. Lainema, J.; Bossen, F.; Han, W.J.; Min, J.; Ugur, K. Intra coding of the HEVC standard. IEEE Trans. Circuits Syst. Video Technol. 2012, 22, 1792–1801.
  7. Sullivan, G.J.; Ohm, J.R.; Han, W.J.; Wiegand, T. Overview of the high efficiency video coding (HEVC) standard. IEEE Trans. Circuits Syst. Video Technol. 2012, 22, 1649–1668.
  8. Saldanha, M.; Corrêa, M.; Corrêa, G.; Palomino, D.; Porto, M.; Zatt, B.; Agostini, L. An overview of dedicated hardware designs for state-of-the-art AV1 and H.266/VVC video codecs. In Proceedings of the 2020 27th IEEE International Conference on Electronics, Circuits and Systems (ICECS), Glasgow, UK, 23–25 November 2020; pp. 1–4.
  9. Mukherjee, D.; Bankoski, J.; Grange, A.; Han, J.; Koleszar, J.; Wilkins, P.; Xu, Y.; Bultje, R. The latest open-source video codec VP9 - An overview and preliminary results. In Proceedings of the 2013 Picture Coding Symposium (PCS), San Jose, CA, USA, 8–11 December 2013; pp. 390–393.
  10. Chen, Y.; Mukherjee, D.; Han, J.; Grange, A.; Xu, Y.; Liu, Z.; Parker, S.; Chen, C.; Su, H.; Joshi, U.; et al. An overview of core coding tools in the AV1 video codec. In Proceedings of the 2018 Picture Coding Symposium (PCS), San Francisco, CA, USA, 24–27 June 2018; pp. 41–45.
  11. Viitanen, M.; Sainio, J.; Mercat, A.; Lemmetti, A.; Vanne, J. From HEVC to VVC: The first development steps of a practical intra video encoder. IEEE Trans. Consum. Electron. 2022, 68, 139–148.
  12. Bross, B.; Chen, J.; Ohm, J.R.; Sullivan, G.J.; Wang, Y.K. Developments in international video coding standardization after AVC, with an overview of versatile video coding (VVC). Proc. IEEE 2021, 109, 1463–1493.
  13. Grange, A.; de Rivaz, P.; Hunt, J. VP9 Bitstream & Decoding Process Specification. Available online: https://storage.googleapis.com/downloads.webmproject.org/docs/vp9/vp9-bitstream-specification-v0.6-20160331-draft.pdf (accessed on 29 February 2024).
  14. Sullivan, G.J.; Wiegand, T. Rate-distortion optimization for video compression. IEEE Signal Process. Mag. 1998, 15, 74–90.
  15. Ahmed, N.; Natarajan, T.; Rao, K.R. Discrete cosine transform. IEEE Trans. Comput. 1974, 23, 90–93.
  16. Malvar, H.S.; Hallapuro, A.; Karczewicz, M.; Kerofsky, L. Low-complexity transform and quantization in H.264/AVC. IEEE Trans. Circuits Syst. Video Technol. 2003, 13, 598–603.
  17. Ghasempour, M.; Ghanbari, M. A low complexity system for multiple data embedding into H.264 coded video bit-stream. IEEE Trans. Circuits Syst. Video Technol. 2020, 30, 4009–4019.
  18. Wallace, G.K. The JPEG still picture compression standard. IEEE Trans. Consum. Electron. 1992, 38, 18–34.
  19. Ding, J.J.; Wang, I.H.; Chen, H.Y. Improved efficiency on adaptive arithmetic coding for data compression using range-adjusting scheme, increasingly adjusting step, and mutual-learning scheme. IEEE Trans. Circuits Syst. Video Technol. 2018, 28, 3412–3423.
  20. Internet Archive. 2023. Available online: https://web.archive.org/web/20230509144046/http://trace.eas.asu.edu/yuv/index.html (accessed on 29 February 2024).
  21. 1920x1080.yuv Images for AVC Codec. 2017. Available online: https://github.com/ireader/avcodec/blob/master/libavo/test/1920x1080.yuv (accessed on 29 February 2024).
  22. Index of Video. 2010. Available online: https://media.xiph.org/video (accessed on 29 February 2024).
  23. Richardson, I.E.G. H.264 and MPEG-4 Video Compression: Video Coding for Next-Generation Multimedia; John Wiley & Sons: West Sussex, UK, 2004.
Figure 1. H.264 intraframe coding flow.
Figure 2. Intra-prediction mode for 4 × 4 and 16 × 16 blocks (for 16 × 16 blocks, only 0–3 are available). Since Mode 2 corresponds to the DC mode, there is no movement and no arrow for Mode 2.
Figure 3. DC coefficients (left) and DC and AC coefficients (right) of Akiyo.cif with sizes of 72 × 88 and 288 × 352, respectively.
Table 1. Relationship between threshold and suffix length.
| Current Suffix Length | Threshold for Increasing Suffix Length |
| --- | --- |
| 0 | 0 |
| 1 | 3 |
| 2 | 6 |
| 3 | 12 |
| 4 | 24 |
| 5 | 48 |
| 6 | N/A (highest) |
Table 2. Parts of level code with corresponding suffix length.
| Level | Code (Suffix Length = 0) | Level | Code (Suffix Length = 1) |
| --- | --- | --- | --- |
| 1 | 1 | 1 | 10 |
| −1 | 01 | −1 | 11 |
| 2 | 001 | 2 | 010 |
| −2 | 0001 | −2 | 011 |
| 3 | 00001 | 3 | 0010 |
| −3 | 000001 | −3 | 0011 |
| −7 | 00000000000001 | 14 | 000000000000010 |
| ±8 ~ ±15 | 000000000000001xxxx | −14 | 000000000000011 |
| > ±16 | 0000000000000001xxxxxxxxxxxx | 15 | 0000000000000010 |
| | | −15 | 0000000000000011 |
| | | > ±15 | 0000000000000001xxxxxxxxxxxx |
xxxx and xxxxxxxxxxxx indicate extra bits.
Table 3. An example of the input of scanning.
| X | 6 | 4 | 1 |
| 4 | 4 | 3 | 2 |
| 1 | 1 | 1 | 1 |
| 1 | 1 | 1 | 1 |
Table 4. Adaptive scanning order in the proposed method.
| X | 2 | 3 | 9 |
| 1 | 4 | 8 | 10 |
| 5 | 7 | 11 | 14 |
| 6 | 12 | 13 | 15 |
Table 5. Bit length comparison of CAVLC and proposed method for CIF inputs.
| Input | CAVLC (bits) | Proposed (bits) | Reduction |
| --- | --- | --- | --- |
| Akiyo | 177,925 | 171,230 | −3.18% |
| bridge-close | 370,383 | 331,572 | −10.48% |
| bridge-far | 301,698 | 257,717 | −14.58% |
| bus | 399,413 | 362,545 | −9.23% |
| foreman | 275,447 | 251,234 | −8.79% |
| flower | 359,706 | 326,845 | −9.14% |
| mother | 187,798 | 181,465 | −3.37% |
| news | 232,595 | 222,265 | −4.44% |
| silent | 342,727 | 302,813 | −11.64% |
| waterfall | 400,367 | 356,244 | −11.02% |
| Average bit reduction | | | −8.58% |
Table 6. Bit length comparison of CAVLC and proposed method for QCIF inputs.
| Input | CAVLC (bits) | Proposed (bits) | Reduction |
| --- | --- | --- | --- |
| Akiyo | 52,938 | 51,761 | −2.22% |
| bridge-close | 103,432 | 91,247 | −11.78% |
| bridge-far | 77,659 | 67,597 | −12.96% |
| foreman | 79,643 | 73,088 | −8.23% |
| mother | 59,212 | 56,002 | −5.42% |
| news | 72,418 | 69,272 | −4.34% |
| silent | 88,899 | 80,225 | −9.76% |
| Average bit reduction | | | −7.81% |
Table 7. Bit length comparison of CAVLC and the proposed method for FHD inputs.
| Input | CAVLC (bits) | Proposed (bits) | Reduction |
| --- | --- | --- | --- |
| AOV5 | 4,711,101 | 4,405,510 | −6.49% |
| Time lapse | 3,030,756 | 2,882,682 | −4.89% |
| camera1 | 2,979,962 | 2,772,831 | −6.95% |
| west wind | 3,353,710 | 3,270,767 | −2.47% |
| rush fields | 5,436,880 | 5,000,844 | −8.02% |
| controlled burn | 6,905,301 | 6,314,721 | −8.55% |
| life | 5,374,526 | 4,741,603 | −11.78% |
| pedestrian | 3,422,777 | 3,089,042 | −9.75% |
| park joy | 6,513,638 | 5,729,934 | −12.03% |
| Average bit reduction | | | −7.88% |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Hsu, H.-C.; Ding, J.-J. Adaptive Frequency and Assignment Algorithm for Context-Based Arithmetic Compression Codes for H.264 Video Intraframe Encoding. Eng. Proc. 2025, 98, 4. https://doi.org/10.3390/engproc2025098004
