Convolutional neural networks (CNNs) are widely adopted across a broad range of applications. State-of-the-art CNN models deliver excellent classification performance, but they demand large amounts of computation and data movement because they typically comprise many processing layers. Among these layers, the convolution layers, which perform vast numbers of multiplications and additions, account for the major portion of both computation and memory access. Reducing computation and memory access is therefore the key to a high-performance CNN implementation. In this study, we propose a cost-effective neural network accelerator, named CENNA, whose hardware cost is reduced by a cost-centric matrix multiplication that combines Strassen's multiplication with naïve multiplication. Furthermore, the convolution method built on the proposed matrix multiplication minimizes data movement by reusing both the feature map and the convolution kernel without any additional control logic. In terms of throughput, power consumption, and silicon area, CENNA is up to 88 times more efficient than conventional designs for CNN inference.
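The abstract does not detail the CENNA datapath or how the two multiplication schemes are mixed; as a point of reference only, the sketch below contrasts the two building blocks it names, a naïve 2x2 matrix product (8 multiplications, 4 additions) against Strassen's 2x2 scheme (7 multiplications at the cost of extra additions). The 2x2 tile size and the NumPy formulation are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def naive_2x2(A, B):
    """Naive 2x2 matrix product: 8 multiplications, 4 additions."""
    return np.array([
        [A[0, 0]*B[0, 0] + A[0, 1]*B[1, 0], A[0, 0]*B[0, 1] + A[0, 1]*B[1, 1]],
        [A[1, 0]*B[0, 0] + A[1, 1]*B[1, 0], A[1, 0]*B[0, 1] + A[1, 1]*B[1, 1]],
    ])

def strassen_2x2(A, B):
    """Strassen's 2x2 scheme: 7 multiplications, 18 additions/subtractions."""
    m1 = (A[0, 0] + A[1, 1]) * (B[0, 0] + B[1, 1])
    m2 = (A[1, 0] + A[1, 1]) * B[0, 0]
    m3 = A[0, 0] * (B[0, 1] - B[1, 1])
    m4 = A[1, 1] * (B[1, 0] - B[0, 0])
    m5 = (A[0, 0] + A[0, 1]) * B[1, 1]
    m6 = (A[1, 0] - A[0, 0]) * (B[0, 0] + B[0, 1])
    m7 = (A[0, 1] - A[1, 1]) * (B[1, 0] + B[1, 1])
    return np.array([
        [m1 + m4 - m5 + m7, m3 + m5],
        [m2 + m4,           m1 - m2 + m3 + m6],
    ])

if __name__ == "__main__":
    A, B = np.random.rand(2, 2), np.random.rand(2, 2)
    # Both schemes compute the same product; they differ only in cost structure.
    assert np.allclose(naive_2x2(A, B), strassen_2x2(A, B))
```

The trade-off this illustrates is the one the abstract refers to: Strassen's scheme saves multiplications (the expensive operation in hardware) at the price of additional additions, which is why a cost-centric design may combine it with naïve multiplication rather than use either alone.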