MDPI - Publisher of Open Access Journals

26 pages, 1137 KB

Open Access Article

A Novel Low-Complexity and Parallel Algorithm for DCT IV Transform and Its GPU Implementation

by Doru Florin Chiper and Dan Marius Dobrea

Appl. Sci. 2024, 14(17), 7491; https://doi.org/10.3390/app14177491 - 24 Aug 2024

Cited by 3 | Viewed by 2461

This study proposes a novel factorization of the DCT IV algorithm that splits it into four or eight sections that can run in parallel, while also significantly reducing the arithmetic complexity. The algorithm was evaluated on two NVIDIA GPU systems, the Jetson AGX Xavier and the Jetson Orin Nano. With four parallel sections, the speedup exceeds the factor of four expected from segmentation alone: compared with the classical sequential implementation of DCT IV, it is at least 5.41 times on Jetson AGX Xavier and at least 10.11 times on Jetson Orin Nano. With eight parallel sections the speedup is higher still, at least 8.08 times on Jetson AGX Xavier and 11.81 times on Jetson Orin Nano.

(This article belongs to the Special Issue Advances in Digital Signal Processing: New Applications and Efficient Implementations)
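As a rough illustration of what "sections that can be run in parallel" means, the sketch below evaluates the textbook DCT-IV definition directly and splits the output indices into four independent blocks. This is not the factorization proposed in the paper and it does not reduce the arithmetic complexity; the function names (`dct4_direct`, `dct4_block`, `dct4_sectioned`) are hypothetical, and Python threads are used only to show that the blocks share no intermediate data, not to obtain a GPU-like speedup.

```python
import math
from concurrent.futures import ThreadPoolExecutor


def dct4_direct(x):
    """Unnormalized DCT-IV: X[k] = sum_n x[n] * cos(pi/N * (n + 1/2) * (k + 1/2))."""
    n_pts = len(x)
    return [
        sum(x[n] * math.cos(math.pi / n_pts * (n + 0.5) * (k + 0.5)) for n in range(n_pts))
        for k in range(n_pts)
    ]


def dct4_block(x, k_start, k_stop):
    """Compute only the outputs X[k_start:k_stop]; no data from other blocks is needed."""
    n_pts = len(x)
    return [
        sum(x[n] * math.cos(math.pi / n_pts * (n + 0.5) * (k + 0.5)) for n in range(n_pts))
        for k in range(k_start, k_stop)
    ]


def dct4_sectioned(x, sections=4):
    """Split the output index range into `sections` blocks and evaluate them concurrently."""
    n_pts = len(x)
    bounds = [n_pts * s // sections for s in range(sections + 1)]
    with ThreadPoolExecutor(max_workers=sections) as pool:
        blocks = pool.map(lambda b: dct4_block(x, b[0], b[1]), zip(bounds, bounds[1:]))
        # Concatenate the blocks in order to rebuild the full output vector.
        return [value for block in blocks for value in block]


if __name__ == "__main__":
    signal = [float(i % 5) - 2.0 for i in range(16)]
    print(dct4_sectioned(signal, sections=4)[:4])
```

On a GPU each block would map to its own group of threads; here the sectioned result simply matches the direct one up to floating-point rounding.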


17 pages, 1922 KB

Open Access Article

An Area-Efficient Unified VLSI Architecture for Type IV DCT/DST Having an Efficient Hardware Security with Low Overheads

by Doru Florin Chiper and Arcadie Cracan

Electronics 2023, 12(21), 4471; https://doi.org/10.3390/electronics12214471 - 30 Oct 2023

Cited by 5 | Viewed by 1427

Abstract

This paper introduces an efficient unified VLSI implementation for type IV DCT/DST that also addresses a challenging problem in designing high-performance VLSI chips for common goods: securing the hardware without giving up performance. The solution uses a new systolic array algorithm for type IV DST which, combined with one previously designed for type IV DCT, yields an efficient unified VLSI architecture. The proposed method uses special arithmetic structures, called quasi-cycle convolutions, that can be efficiently mapped onto linear systolic arrays. Besides offering low hardware complexity and high speed, the unified architecture allows the obfuscation technique to be included with very low overheads.

(This article belongs to the Section Circuit and Signal Processing)


17 pages, 2269 KB

Open Access Article

An Improved VLSI Algorithm for an Efficient VLSI Implementation of a Type IV DCT That Allows an Efficient Incorporation of Hardware Security with a Low Overhead

by Doru Florin Chiper

Electronics 2023, 12(1), 243; https://doi.org/10.3390/electronics12010243 - 3 Jan 2023

Cited by 3 | Viewed by 2507

Abstract

This paper addresses one of the most challenging problems in designing VLSI chips for common goods: incorporating security techniques efficiently while maintaining high performance and reduced hardware complexity, with the overheads introduced by the security techniques kept as low as possible. It proposes an improved approach for including the obfuscation technique in the VLSI implementation of type IV DCT, an important DSP algorithm used in multimedia applications. The approach is based on a new VLSI algorithm that decomposes type IV DCT into six quasi-cycle convolutions. The method uses a regular and modular structure called quasi-cyclic convolution, and the resulting architecture follows the systolic array paradigm, which brings the advantages of systolic arrays, especially high speed, together with efficient utilization of the hardware. The result is a more efficient VLSI architecture for type IV DCT, with significantly reduced hardware complexity and an improved hardware security mechanism incorporated with low overheads. These features are very important for resource-constrained common goods.

(This article belongs to the Special Issue Efficient Algorithms and Architectures for DSP Applications)


23 pages, 4845 KB

Open Access Article

A New Approach for a Unified Architecture for Type IV DCT/DST with an Efficient Incorporation of Obfuscation Technique

by Doru Florin Chiper and Laura-Teodora Cotorobai

Electronics 2021, 10(14), 1656; https://doi.org/10.3390/electronics10141656 - 12 Jul 2021

Cited by 13 | Viewed by 2437

Abstract

This paper addresses one challenging problem in designing VLSI chips, namely hardware security, by presenting a new design approach that incorporates the obfuscation technique in the VLSI implementation of some important DSP algorithms. The proposed method yields a unified VLSI architecture for computing the type IV discrete cosine transform (DCT-IV) and the type IV discrete sine transform (DST-IV), integrating the obfuscation technique efficiently while keeping overheads low. The algorithms for the two transforms were restructured so that their structures are fairly similar and can be implemented on the same VLSI chip and the same hardware with very few modifications, confined to the pre-processing and post-processing stages. The design uses regular and modular structures, named quasi-correlation, and the architecture is inspired by the systolic array paradigm. It therefore combines hardware security with the advantages of regular and modular structures. The resulting unified architecture computes both transforms on the same hardware, incorporates obfuscation with very low overheads, and offers high speed and low hardware complexity, the latter owing to the efficient use of hardware resources across the two algorithms.

(This article belongs to the Special Issue Efficient Algorithms and Architectures for DSP Applications)
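The idea that DCT-IV and DST-IV can share one computational core and differ only in pre-processing and post-processing can be illustrated with a standard identity: reversing the input and alternating the output signs turns a DCT-IV into a DST-IV. The sketch below checks that identity numerically. It is a textbook relation, not necessarily the restructuring used in the paper, and the function names are hypothetical.

```python
import math


def dct4(x):
    """Unnormalized DCT-IV: X[k] = sum_n x[n] * cos(pi/N * (n + 1/2) * (k + 1/2))."""
    n_pts = len(x)
    return [
        sum(x[n] * math.cos(math.pi / n_pts * (n + 0.5) * (k + 0.5)) for n in range(n_pts))
        for k in range(n_pts)
    ]


def dst4(x):
    """Unnormalized DST-IV: Y[k] = sum_n x[n] * sin(pi/N * (n + 1/2) * (k + 1/2))."""
    n_pts = len(x)
    return [
        sum(x[n] * math.sin(math.pi / n_pts * (n + 0.5) * (k + 0.5)) for n in range(n_pts))
        for k in range(n_pts)
    ]


def dst4_via_dct4(x):
    """DST-IV computed with the DCT-IV core plus simple pre- and post-processing."""
    core = dct4(x[::-1])                                # pre-processing: reverse the input
    return [(-1) ** k * c for k, c in enumerate(core)]  # post-processing: alternate signs


if __name__ == "__main__":
    signal = [0.5, -1.0, 2.0, 3.5, -0.25, 1.0, 4.0, -2.0]
    direct = dst4(signal)
    shared = dst4_via_dct4(signal)
    print(max(abs(a - b) for a, b in zip(direct, shared)))
```

The printed maximum difference should be at the level of floating-point rounding, which is why a single hardware core can in principle serve both transforms.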


Search Results (4)
