# An FPGA Implementation of a Convolutional Auto-Encoder


## Abstract


## 1. Introduction

## 2. FPGA Implementation of the CAE

#### 2.1. Hardware Framework

#### 2.2. Design of Convolutional Encoding Module

#### 2.3. Design of PE

**Algorithm 1.** The convolution operation process in a PE.

```
Initialization: reset the system and cache the parameters.
Input:       the eight-channel image data, data_in.
Shift:       data_shift[31:0] = {data_shift[15:0], data_in};
Convolution: for (row = 0; row <= 1; row++)
               for (col = 0; col <= 1; col++)
                 for (p = 0; p <= 2; p++)
                   for (q = 0; q <= 2; q++)
                     data_convolution[row][col] += kernel[p][q] * data_shift[row + p][col + q];
Activation:  execute the ReLU function.
Pooling:     compare the 2 × 2 results (0 <= {row, col} <= 1) and keep the maximum value.
Return:      output the pooling result.
```
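The PE's datapath can be sketched in software for clarity. This is a minimal behavioral model, not the authors' RTL: the function name `pe_forward` and the use of NumPy are illustrative assumptions. A 4 × 4 input window is convolved with a 3 × 3 kernel (valid mode, stride 1), giving a 2 × 2 patch that is passed through ReLU and then max-pooled to a single value, mirroring the loop bounds in Algorithm 1.

```python
import numpy as np

def pe_forward(window, kernel):
    """Behavioral sketch of one PE pass (not the paper's RTL):
    4x4 window -> 3x3 convolution -> 2x2 patch -> ReLU -> max pool."""
    conv = np.zeros((2, 2))
    for row in range(2):            # output row (0..1)
        for col in range(2):        # output column (0..1)
            for p in range(3):      # kernel row (0..2)
                for q in range(3):  # kernel column (0..2)
                    conv[row][col] += kernel[p][q] * window[row + p][col + q]
    relu = np.maximum(conv, 0.0)    # ReLU activation
    return relu.max()               # 2x2 max pooling
```

In the hardware, the four inner products are computed in parallel from the shift-register contents rather than by sequential loops; the model above only fixes the arithmetic being performed.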

## 3. Experimental Results and Discussion

MAX_I represents the maximum value of the image color (255 for an 8-bit channel). MSE is the mean square error between the original image $I$ and the reconstructed image $K$; for an $m \times n$ image, MSE is expressed as shown below:

$$\mathrm{MSE} = \frac{1}{mn}\sum_{i=0}^{m-1}\sum_{j=0}^{n-1}\left[ I(i,j) - K(i,j) \right]^2$$
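The two quality metrics can be computed with a few lines of code. This is a generic sketch of the standard definitions (the helper names `mse` and `psnr` are our own, not from the paper), with MAX_I defaulting to 255 for 8-bit images:

```python
import numpy as np

def mse(original, reconstructed):
    # Mean square error between original image I and reconstruction K
    diff = original.astype(float) - reconstructed.astype(float)
    return np.mean(diff ** 2)

def psnr(original, reconstructed, max_i=255.0):
    # Peak signal-to-noise ratio in dB; max_i is the maximum pixel value
    err = mse(original, reconstructed)
    return float('inf') if err == 0 else 10.0 * np.log10(max_i ** 2 / err)
```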

## 4. Conclusions

## Author Contributions

## Conflicts of Interest


**Figure 2.** The encoder of the proposed CAE hardware framework. The different colors marked in the zero-padding module indicate different filling methods: the first and last rows are filled entirely with zeros, while the intermediate rows are padded with zeros only at the beginning and end of each row. The different colors marked in the channel arbitrator module represent the corresponding four channels for packaging.

**Figure 4.** The 4 × 4 convolution window (the red dotted box) contains four 3 × 3 sub-windows, Slide_1, Slide_2, Slide_3, and Slide_4, each marked with a different color or pattern.
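The window-reuse idea in Figure 4 is easy to express in code. As a small sketch (the function name `sub_windows` is our own), the four overlapping 3 × 3 sub-windows are the slices of the 4 × 4 window whose top-left corners range over a 2 × 2 grid:

```python
import numpy as np

def sub_windows(window):
    """Return the four overlapping 3x3 sub-windows (Slide_1..Slide_4)
    of a 4x4 convolution window, in row-major order of their
    top-left corners."""
    return [window[r:r + 3, c:c + 3] for r in range(2) for c in range(2)]
```

Because the four sub-windows share most of their pixels, loading one 4 × 4 window lets four convolutions proceed from the same buffered data instead of re-reading the image four times.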

| Resource | Utilization | Available | Utilization (%) |
|---|---|---|---|
| LUT | 22,441 | 242,400 | 9.26 |
| FF | 27,100 | 484,800 | 5.59 |
| BRAM | 55.50 | 600 | 9.25 |
| IO | 120 | 520 | 23.08 |
| BUFG | 8 | 480 | 1.67 |
| MMCM | 2 | 10 | 20.00 |

| Performance | FPGA | CPU | GPU |
|---|---|---|---|
| Computing time (ms) | 15.73 | 115.29 | 13.65 |
| Power consumption (W) | 2.762 | 32 | 44 |
| Computing performance (GOPS) | 73.18 | 3.25 | 76.38 |
| Computing performance per Watt (GOPS/W) | 26.49 | 0.101 | 1.73 |
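The per-Watt row follows directly from the performance and power rows; a quick cross-check of the arithmetic (using the values reported above):

```python
# Energy efficiency (GOPS/W) = computing performance / power consumption,
# recomputed from the table's performance and power rows.
platforms = {
    "FPGA": (73.18, 2.762),
    "CPU":  (3.25, 32.0),
    "GPU":  (76.38, 44.0),
}
efficiency = {name: gops / watts for name, (gops, watts) in platforms.items()}
```

The FPGA's roughly 26.5 GOPS/W is about 15 times the GPU's and over 250 times the CPU's, which is the central efficiency claim of the comparison.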

© 2018 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Zhao, W.; Jia, Z.; Wei, X.; Wang, H. An FPGA Implementation of a Convolutional Auto-Encoder. *Appl. Sci.* **2018**, *8*, 504.
https://doi.org/10.3390/app8040504
