Article

Exploring Efficient Acceleration Architecture for Winograd-Transformed Transposed Convolution of GANs on FPGAs †

by Xinkai Di 1,2, Hai-Gang Yang 1,2,3,*, Yiping Jia 1,2,3, Zhihong Huang 1,2 and Ning Mao 1,2
1 Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing 100094, China
2 University of Chinese Academy of Sciences, Beijing 100049, China
3 Shandong Industrial Institute of Integrated Circuits Technology Ltd., Jinan 250001, China
* Author to whom correspondence should be addressed.
† This paper is an extended version of our paper published in FPT 2019, the International Conference on Field-Programmable Technology.
Electronics 2020, 9(2), 286; https://doi.org/10.3390/electronics9020286
Received: 8 January 2020 / Revised: 29 January 2020 / Accepted: 30 January 2020 / Published: 7 February 2020
(This article belongs to the Special Issue New Applications and Architectures Based on FPGA/SoC)
An efficient acceleration architecture for transposed convolution layers is essential because transposed convolution operations, as critical components of the generative model in generative adversarial networks, are inherently computationally intensive. In addition, the pre-processing step of inserting and padding zeros into the input feature maps introduces many ineffective operations. Most existing FPGA (Field Programmable Gate Array) based architectures for convolution layers cannot address these issues. In this paper, we first propose a novel dataflow that splits the filters and their corresponding input feature maps into four sets and then applies the Winograd algorithm for fast and highly efficient processing. Second, we present an underlying FPGA-based accelerator architecture that features its own processing units with an embedded parallel, pipelined, and buffered processing flow. Finally, a parallelism-aware memory partition technique and a hardware-oriented design space exploration are applied in coordination to support the required parallel operations and to determine the optimal design parameters, respectively. Experiments on several state-of-the-art GANs using our methods achieve an average performance of 639.2 GOPS on the Xilinx ZCU102 and 162.5 GOPS on the Xilinx VC706. Compared with a conventional optimized accelerator baseline, this work demonstrates an 8.6× (up to 11.7×) increase in processing performance, whereas prior studies in the literature report improvements below 2.2×.
Keywords: generative adversarial networks (GANs); transposed convolution; Winograd; FPGA; acceleration architecture; processing units
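To make the filter-splitting idea in the abstract concrete, the following is a minimal, framework-independent sketch (not the paper's FPGA implementation), assuming the common GAN case of a single-channel 4×4 kernel with stride 2. It shows that a zero-inserting transposed convolution can be rewritten as four ordinary dense convolutions, one per kernel phase, which is the form to which a Winograd transform can then be applied. The function names and the use of NumPy/SciPy (scipy.signal.convolve2d) are illustrative choices, not taken from the paper.

```python
# Illustrative sketch (not the paper's accelerator): decompose a stride-2
# transposed convolution into four ordinary convolutions on kernel "phases",
# removing the zero-insertion step and leaving dense sub-convolutions that a
# Winograd transform (or an FPGA processing unit) could operate on.
import numpy as np
from scipy.signal import convolve2d


def transposed_conv_zero_insert(x, w, stride=2):
    """Reference formulation: insert zeros between input pixels, then convolve."""
    up = np.zeros((stride * (x.shape[0] - 1) + 1, stride * (x.shape[1] - 1) + 1))
    up[::stride, ::stride] = x
    return convolve2d(up, w, mode="full")


def transposed_conv_split(x, w, stride=2):
    """Split formulation: stride**2 dense convolutions, one per kernel phase."""
    k = w.shape[0]
    out = np.zeros((stride * (x.shape[0] - 1) + k, stride * (x.shape[1] - 1) + k))
    for p in range(stride):
        for q in range(stride):
            w_pq = w[p::stride, q::stride]            # sub-kernel for phase (p, q)
            plane = convolve2d(x, w_pq, mode="full")  # ordinary dense convolution
            out[p::stride, q::stride] = plane         # interleave into the output
    return out


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.standard_normal((8, 8))   # one input feature map (single channel)
    w = rng.standard_normal((4, 4))   # typical 4x4 GAN deconvolution kernel
    assert np.allclose(transposed_conv_zero_insert(x, w),
                       transposed_conv_split(x, w))
    print("zero-insertion and four-way split formulations agree")
```

Running the script confirms numerically that the two formulations produce identical outputs; in a hardware mapping along the lines the abstract describes, each of the four dense sub-convolutions would feed Winograd-transformed processing units instead of operating on a zero-inserted feature map.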

MDPI and ACS Style

Di, X.; Yang, H.-G.; Jia, Y.; Huang, Z.; Mao, N. Exploring Efficient Acceleration Architecture for Winograd-Transformed Transposed Convolution of GANs on FPGAs. Electronics 2020, 9, 286. https://doi.org/10.3390/electronics9020286
