MDPI - Publisher of Open Access Journals

17 pages, 595 KiB

Open AccessFeature PaperArticle

by Alexander Magyari and Yuhua Chen

Electronics 2025, 14(3), 550; https://doi.org/10.3390/electronics14030550 - 29 Jan 2025

Viewed by 1063

We introduce a modular reduction method that is optimized for hardware and outperforms conventional approaches. By leveraging calculated reduction cycles and combinatorial logic, we achieve a remarkable 30% reduction in power usage, 27% reduction in Configurable Logic Blocks (CLBs), and 42% fewer look-up [...] Read more.

We introduce a modular reduction method that is optimized for hardware and outperforms conventional approaches. By leveraging calculated reduction cycles and combinatorial logic, we achieve a remarkable 30% reduction in power usage, 27% reduction in Configurable Logic Blocks (CLBs), and 42% fewer look-up tables (LUTs) than the conventional implementation. Our Hardware-Optimized Modular Reduction (HOM-R) system can condense a 256-bit input to a four-bit base within a single 250 MHz clock cycle. Further, our method stands out from prevalent techniques, such as Barrett and Montgomery reduction, by eliminating the need for multipliers or dividers, and relying solely on addition and customizable LUTs. This innovative method frees up FPGA resources typically consumed by power-intensive DSPs, offering a compelling low-power, low-latency alternative for diverse design needs. Full article

(This article belongs to the Special Issue Emerging Applications of FPGAs and Reconfigurable Computing System)

► Show Figures

Figure 1

33 pages, 4585 KiB

Open AccessArticle

Optimal Implementation of Tapped Delay Line Time-to-Digital Converters in 20 nm Xilinx UltraScale FPGAs

by Mattia Morabito, Nicola Lusardi, Fabio Garzetti, Gabriele Fiumicelli, Gabriele Bonanno, Enrico Ronconi, Andrea Costa and Angelo Geraci

Electronics 2024, 13(24), 4888; https://doi.org/10.3390/electronics13244888 - 11 Dec 2024

Viewed by 2291

Abstract

This study investigated implementation strategies to optimize the precision of Tapped Delay Line (TDL) Time-to-Digital Converters (TDCs) designed for Xilinx 20 nm UltraScale Field-Programmable Gate Arrays (FPGAs). This optimization process aims to bridge the performance gap between FPGA-based TDCs, which are more flexible [...] Read more.

This study investigated implementation strategies to optimize the precision of Tapped Delay Line (TDL) Time-to-Digital Converters (TDCs) designed for Xilinx 20 nm UltraScale Field-Programmable Gate Arrays (FPGAs). This optimization process aims to bridge the performance gap between FPGA-based TDCs, which are more flexible and suitable for fast prototyping, and the better-performing Application-Specific Integrated Circuit (ASIC) solutions, making FPGA-based TDCs viable for cutting-edge applications. Our key areas of focus included the optimal design of the decoder, the degree of sub-interpolation, and the placement of TDLs, with particular emphasis on the clocking distribution scheme within the Configurable Logic Block (CLB) to minimize the effects of Bubble Errors (BEs) and quantization error. The research led to the development and comparison of multiple TDL TDC solutions implemented on a Kintex UltraScale device (i.e., XCKU040-2FFVA1156E) housed on a KCU105 general-purpose Evaluation Board (EVB). From these, two main solutions emerged: one with high precision and one with low area. The first one was characterized by a Single-Shot Precision (SSP) of 2.64 ps r.m.s., and by Differential and Integral Non-Linearity (DNL/INL) Errors of 0.523 ps and 16.939 ps, respectively, occupying 883 CLBs and 126 kb of Block RAM (BRAM). The second one had an SSP of 3.75 ps r.m.s., a DNL of 0.599 ps, and an INL of 7.151 ps, and it occupies only 259 CLBs and 72 kb of BRAM. Full article

(This article belongs to the Special Issue System-on-Chip (SoC) and Field-Programmable Gate Array (FPGA) Design)

► Show Figures

Figure 1

33 pages, 2291 KiB

Open AccessArticle

Hardware-Efficient Configurable Ring-Oscillator-Based Physical Unclonable Function/True Random Number Generator Module for Secure Key Management

by Santiago Sánchez-Solano, Luis F. Rojas-Muñoz, Macarena C. Martínez-Rodríguez and Piedad Brox

Sensors 2024, 24(17), 5674; https://doi.org/10.3390/s24175674 - 31 Aug 2024

Cited by 3 | Viewed by 2002

Abstract

The use of physical unclonable functions (PUFs) linked to the manufacturing process of the electronic devices supporting applications that exchange critical data over the Internet has made these elements essential to guarantee the authenticity of said devices, as well as the confidentiality and [...] Read more.

The use of physical unclonable functions (PUFs) linked to the manufacturing process of the electronic devices supporting applications that exchange critical data over the Internet has made these elements essential to guarantee the authenticity of said devices, as well as the confidentiality and integrity of the information they process or transmit. This paper describes the development of a configurable PUF/TRNG module based on ring oscillators (ROs) that takes full advantage of the structure of modern programmable devices offered by Xilinx 7 Series families. The proposed architecture improves the hardware efficiency with two main objectives. On the one hand, we perform an exhaustive statistical characterization of the results derived from the exploitation of RO configurability. On the other hand, we undertake the development of a new version of the module that requires a smaller amount of resources while considerably increasing the number of output bits compared to other proposals previously reported in the literature. The design as a highly parameterized intellectual property (IP) module connectable through a standard interface to a soft- or hard-core general-purpose processor greatly facilitates its integration into embedded solutions while accelerating the validation and characterization of this element on the same electronic device that implements it. The studies carried out reveal adequate values of reliability, uniqueness, and unpredictability when the module acts as a PUF, as well as acceptable levels of randomness and entropy when it acts as a true random number generator (TRNG). They also illustrate the ability to obfuscate and recover identifiers or cryptographic keys of up to 4096 bits using an implementation of the PUF/TRNG module that requires only an array of

4 \times 4

configurable logic blocks (CLBs) to accommodate the RO bank. Full article

(This article belongs to the Collection Cryptography and Security in IoT and Sensor Networks)

► Show Figures

Figure 1

16 pages, 3977 KiB

Open AccessFeature PaperArticle

Energy Efficient CLB Design Based on Adiabatic Logic for IoT Applications

by Wu Yang, Milad Tanavardi Nasab and Himanshu Thapliyal

Electronics 2024, 13(7), 1309; https://doi.org/10.3390/electronics13071309 - 31 Mar 2024

Cited by 3 | Viewed by 1978

Abstract

Many IoT applications require high computational performance and flexibility, and FPGA is a promising candidate. However, increased computation power results in higher energy dissipation, and energy efficiency is one of the key concerns for IoT applications. In this paper, we explore adiabatic logic [...] Read more.

Many IoT applications require high computational performance and flexibility, and FPGA is a promising candidate. However, increased computation power results in higher energy dissipation, and energy efficiency is one of the key concerns for IoT applications. In this paper, we explore adiabatic logic for designing an energy efficient configurable logic block (CLB) and compare it to the CMOS counterpart. The simulation results show that the proposed adiabatic-logic-based look-up table (LUT) has significant energy savings for the frequency range of 1 MHz to 40 MHz, and the least energy savings is at 40 MHz, which is 92.94% energy reduction compared to its CMOS counterpart. Further, the three proposed adiabatic-logic-based memory cells are 14T, 16T, and 12T designs with at least 88.2%, 84.2%, and 87.2% energy savings. Also, we evaluated the performance of the proposed CLBs using an adiabatic-logic-based LUT (AL-LUT) interfacing with adiabatic-logic-based memory cells. The proposed design shows significant energy reduction compared to a CMOS LUT interface with SRAM cells for different frequencies; the energy savings are at least 91.6% for AL-LUT 14T, 89.7% for AL-LUT 16T, and 91.3% AL-LUT 12T. Full article

(This article belongs to the Special Issue The Future of IoT: Advanced AI Based IoT Technologies and Applications)

► Show Figures

Figure 1

16 pages, 5555 KiB

Open AccessArticle

Implementation of EnDat Interface Master Using Configurable Logic Block in MCU

by Kyungah Kim, Duc M. Tran and Joon-Young Choi

Electronics 2024, 13(6), 1101; https://doi.org/10.3390/electronics13061101 - 17 Mar 2024

Cited by 1 | Viewed by 2205

Abstract

In this study, we propose an implementation method of the Encoder Data (EnDat) interface master for slave encoders using only a configurable logic block (CLB) and a serial peripheral interface (SPI) integrated into microcontroller units. By programming the CLB device to execute logic [...] Read more.

In this study, we propose an implementation method of the Encoder Data (EnDat) interface master for slave encoders using only a configurable logic block (CLB) and a serial peripheral interface (SPI) integrated into microcontroller units. By programming the CLB device to execute logic functions and finite state machines designed for the EnDat interface master operation, we realize the EnDat and SPI clocks that are required for the EnDat interface master operation. This approach is cost-efficient because additional hardware components, such as a field-programmable gate array or a complex programmable logic device, are unnecessary for the master implementation. We build a one-axis feed drive system that is powered by an AC motor and equipped with an EnDat linear encoder for measuring table speed and position. By performing various experiments for table position and speed control based on the built feed drive system, we verify the performance and practical usefulness of the implemented EnDat interface master. The maximum EnDat clock frequency without the propagation delay compensation is achieved by 2 MHz, which can cope with 16 kHz control cycle frequency. The usefulness is demonstrated by showing the table speed and position control performance that are acceptable in real applications. Full article

(This article belongs to the Special Issue Design and Development of Digital Embedded Systems)

► Show Figures

Figure 1

15 pages, 18651 KiB

Open AccessArticle

CLB-Based Development of BiSS-C Interface Master for Motor Encoders

by Duc M. Tran, Kyungah Kim and Joon-Young Choi

Electronics 2023, 12(4), 886; https://doi.org/10.3390/electronics12040886 - 9 Feb 2023

Cited by 3 | Viewed by 4730

Abstract

Encoder interfaces should be operated in real time with high precision and fast processing for industrial motor control systems. The continuous bidirectional serial synchronous (BiSS-C) interface is an open-source serial communication protocol designed for motor encoders and is suitable for industrial purposes because [...] Read more.

Encoder interfaces should be operated in real time with high precision and fast processing for industrial motor control systems. The continuous bidirectional serial synchronous (BiSS-C) interface is an open-source serial communication protocol designed for motor encoders and is suitable for industrial purposes because of its fast serial communication speed. In this study, we propose a method for developing a BiSS-C interface master for a motor encoder slave, using only the configurable logic block (CLB) peripheral integrated into TI microcontroller units. By analyzing the detailed operation protocol of the BiSS-C interface, we create the truth and state tables for logic circuits and finite state machines, which are required for the BiSS-C interface master. Then, by programming the CLB based on the created truth and state tables, we implement the master clock, serial peripheral interface (SPI) clock, and operational process for the master. This approach is cost-efficient because additional hardware components, such as a field-programmable gate array or a complex programmable logic device, are not required for the master implementations. The developed method can be immediately applied to developing the masters for other BiSS-C encoders with different specifications, which is certainly necessary for a motor drive development and test. By building an AC motor control system with the developed master and performing various experiments, we verify the performance and practical usefulness of the developed BiSS-C interface master. The maximum master clock frequency without any CRC errors is achieved by 6.25 MHz, which can cope with more than 20 kHz motor control cycle frequency. The usefulness is demonstrated by showing the motor speed and position control performance that are acceptable in real applications. Full article

(This article belongs to the Special Issue Real-Time Digital Control Technologies and Applications)

► Show Figures

Figure 1

13 pages, 5516 KiB

Open AccessArticle

RETRACTED: Express Data Processing on FPGA: Network Interface Cards for Streamlined Software Inspection for Packet Processing

by Sunkari Pradeep, Yogesh Kumar Sharma, Chaman Verma, Gutha Sreeram and Panugati Hanumantha Rao

Appl. Syst. Innov. 2023, 6(1), 9; https://doi.org/10.3390/asi6010009 - 9 Jan 2023

Cited by 1 | Viewed by 3468 | Retraction

Abstract

Modern computers’ network interface cards (NICs) are undergoing changes in order to handle greater data rates and assist with scaling problems caused by general-purpose CPU technology. The inclusion of programmable accelerators to the NIC’s data channel is one of the ongoing improvements that [...] Read more.

Modern computers’ network interface cards (NICs) are undergoing changes in order to handle greater data rates and assist with scaling problems caused by general-purpose CPU technology. The inclusion of programmable accelerators to the NIC’s data channel is one of the ongoing improvements that is particularly intriguing since it gives the accelerator the chance to take on a portion of the CPU’s network packet processing duties. Accelerators are frequently developed using platforms like field-programmable gate arrays because packet processing operations have severe latency requirements (FPGAs). When implementing packet processing activities, FPGAs’ gain for through put is the number of data packets being successfully sent per second and latency is the actual time those packets take. However, due to their restricted resources, programming may need to be shared throughout a variety of applications. We provide hXDP, a software solution for FPGAs that targets the Linux eXpress Data Path and performs packet processing functions outlined with the eBPF technology. While maintaining performance on par with top-tier CPUs, hXDP only uses a tiny portion from the field programmable gate arrays, which are semiconductor devices that are based around a matrix of configuration logic blocks (CLB) connected over programmable interconnects. However, we demonstrate that when aiming towards a purpose-built FPGA architecture, many extended Berkeley packet filters (eBPF) allow programmers to use Berkeley packet filter byte code that makes use of certain kernel resources and instruction set architecture, to collocate and even eliminate, with considerably productivity and effectiveness. On an FPGA NIC, we implement hXDP and test its effectiveness using authentic eBPF programmes from the real world. Our version consumes 15% of the FPGA resources and operates at 156.25 MHz. This can constantly change and lead to the act of identification, inspection, extraction, and manipulation so that a network may make more intelligent management decisions. Full article

(This article belongs to the Special Issue Applied System Innovations Using Recent Communications and Networking Advances)

► Show Figures

Figure 1

19 pages, 885 KiB

Open AccessArticle

A Novel Ultra-Compact FPGA PUF: The DD-PUF

by Riccardo Della Sala, Davide Bellizia and Giuseppe Scotti

Cryptography 2021, 5(3), 23; https://doi.org/10.3390/cryptography5030023 - 8 Sep 2021

Cited by 31 | Viewed by 5323

Abstract

In this paper, we present a novel ultra-compact Physical Unclonable Function (PUF) architecture and its FPGA implementation. The proposed Delay Difference PUF (DD-PUF) is the most dense FPGA-compatible PUF ever reported in the literature, allowing the implementation of two PUF bits in a [...] Read more.

In this paper, we present a novel ultra-compact Physical Unclonable Function (PUF) architecture and its FPGA implementation. The proposed Delay Difference PUF (DD-PUF) is the most dense FPGA-compatible PUF ever reported in the literature, allowing the implementation of two PUF bits in a single slice and provides very good values for all the most important figures of merit. The architecture of the proposed PUF exploits the delay difference between two nominally identical signal paths and the metastability features of D-Latches with an asynchronous reset input. The DD-PUF has been implemented on both Xilinx Spartan-6 and Artix-7 devices and the resulting design flows which allow to accurately balance the nominal delay of the different signal paths is outlined. The circuits have been extensively tested under temperature and supply voltage variations and the results of our evaluations on both FPGA families have shown that the proposed architecture and implementation are able to fit in just 32 Configurable Logic Blocks (CLBs) without sacrificing steadiness, uniqueness and uniformity, thus outperforming most of the previously published FPGA-compatible PUFs. Full article

(This article belongs to the Special Issue Implementation and Verification of Secure Hardware against Physical Attacks)

► Show Figures

Figure 1

18 pages, 985 KiB

Open AccessEditor’s ChoiceArticle

Congestion Prediction in FPGA Using Regression Based Learning Methods

by Pingakshya Goswami and Dinesh Bhatia

Electronics 2021, 10(16), 1995; https://doi.org/10.3390/electronics10161995 - 18 Aug 2021

Cited by 13 | Viewed by 4078

Abstract

Design closure in general VLSI physical design flows and FPGA physical design flows is an important and time-consuming problem. Routing itself can consume as much as 70% of the total design time. Accurate congestion estimation during the early stages of the design flow [...] Read more.

Design closure in general VLSI physical design flows and FPGA physical design flows is an important and time-consuming problem. Routing itself can consume as much as 70% of the total design time. Accurate congestion estimation during the early stages of the design flow can help alleviate last-minute routing-related surprises. This paper has described a methodology for a post-placement, machine learning-based routing congestion prediction model for FPGAs. Routing congestion is modeled as a regression problem. We have described the methods for generating training data, feature extractions, training, regression models, validation, and deployment approaches. We have tested our prediction model by using ISPD 2016 FPGA benchmarks. Our prediction method reports a very accurate localized congestion value in each channel around a configurable logic block (CLB). The localized congestion is predicted in both vertical and horizontal directions. We demonstrate the effectiveness of our model on completely unseen designs that are not initially part of the training data set. The generated results show significant improvement in terms of accuracy measured as mean absolute error and prediction time when compared against the latest state-of-the-art works. Full article

(This article belongs to the Special Issue Advanced AI Hardware Designs Based on FPGAs)

► Show Figures

Figure 1

30 pages, 11651 KiB

Open AccessFeature PaperEditor’s ChoiceArticle

10 Clock-Periods Pipelined Implementation of AES-128 Encryption-Decryption Algorithm up to 28 Gbit/s Real Throughput by Xilinx Zynq UltraScale+ MPSoC ZCU102 Platform

by Paolo Visconti, Stefano Capoccia, Eugenio Venere, Ramiro Velázquez and Roberto de Fazio

Electronics 2020, 9(10), 1665; https://doi.org/10.3390/electronics9101665 - 13 Oct 2020

Cited by 12 | Viewed by 8636

Abstract

The security of communication and computer systems is an increasingly important issue, nowadays pervading all areas of human activity (e.g., credit cards, website encryption, medical data, etc.). Furthermore, the development of high-speed and light-weight implementations of the encryption algorithms is fundamental to improve [...] Read more.

The security of communication and computer systems is an increasingly important issue, nowadays pervading all areas of human activity (e.g., credit cards, website encryption, medical data, etc.). Furthermore, the development of high-speed and light-weight implementations of the encryption algorithms is fundamental to improve and widespread their application in low-cost, low-power and portable systems. In this scientific article, a high-speed implementation of the AES-128 algorithm is reported, developed for a short-range and high-frequency communication system, called Wireless Connector; a Xilinx ZCU102 Field Programmable Gate Array (FPGA) platform represents the core of this communication system since manages all the base-band operations, including the encryption/decryption of the data packets. Specifically, a pipelined implementation of the Advanced Encryption Standard (AES) algorithm has been developed, allowing simultaneous processing of distinct rounds on multiple successive plaintext packets for each clock period and thus obtaining higher data throughput. The proposed encryption system supports 220 MHz maximum operating frequency, ensuring encryption and decryption times both equal to only 10 clock periods. Thanks to the pipelined approach and optimized solutions for the Substitute Bytes operation, the proposed implementation can process and provide the encrypted packets each clock period, thus obtaining a maximum data throughput higher than 28 Gbit/s. Also, the simulation results demonstrate that the proposed architecture is very efficient in using hardware resources, requiring only 1631 Configurable Logic Blocks (CLBs) for the encryption block and 3464 CLBs for the decryption one. Full article

(This article belongs to the Special Issue Emerging Applications of Recent FPGA Architectures)

► Show Figures

Figure 1

Search Results (10)

Further Information

Guidelines

MDPI Initiatives

Follow MDPI

Saved Queries

Search Filter Reset All

Years

Feature Papers

Subjects

Journals

Article Types

Countries / Regions

Search Results (10)

Further Information

Guidelines

MDPI Initiatives

Follow MDPI