VLSI Architecture for 8-Point AI-based Arai DCT having Low Area-Time Complexity and Power at Improved Accuracy
Received: 31 December 2011 / Revised: 22 March 2012 / Accepted: 26 March 2012 / Published: 29 March 2012
Cited by 7 | PDF Full-text (151 KB)
A low complexity digital VLSI architecture for the computation of an algebraic integer (AI) based 8-point Arai DCT algorithm is proposed. AI encoding schemes for exact representation of the Arai DCT transform based on a particularly sparse 2-D AI representation is reviewed, leading
[...] Read more.
A low complexity digital VLSI architecture for the computation of an algebraic integer (AI) based 8-point Arai DCT algorithm is proposed. AI encoding schemes for exact representation of the Arai DCT transform based on a particularly sparse 2-D AI representation is reviewed, leading to the proposed novel architecture based on a new final reconstruction step (FRS) having lower complexity and higher accuracy compared to the state-of-the-art. This FRS is based on an optimization derived from expansion factors that leads to small integer constant-coefficient multiplications, which are realized with common sub-expression elimination (CSE) and Booth encoding. The reference circuit  as well as the proposed architectures for two expansion factors α† = 4.5958 and α′ = 167.2309 are implemented. The proposed circuits show 150% and 300% improvements in the number of DCT coefficients having error ≤ 0:1% compared to . The three designs were realized using both 40 nm CMOS Xilinx Virtex-6 FPGAs and synthesized using 65 nm CMOS general purpose standard cells from TSMC. Post synthesis timing analysis of 65 nm CMOS realizations at 900 mV for all three designs of the 8-point DCT core for 8-bit inputs show potential real-time operation at 2.083 GHz clock frequency leading to a combined throughput of 2.083 billion 8-point Arai DCTs per second. The expansion-factor designs show a 43% reduction in area (A) and 29% reduction in dynamic power (PD)
for FPGA realizations. An 11% reduction in area is observed for the ASIC design for α† = 4.5958 for an 8% reduction in total power (PT
). Our second ASIC design having α′ = 167.2309 shows marginal improvements in area and power compared to our reference design but at significantly better accuracy.