Open Access, Feature Paper, Article

Sequence-To-Sequence Neural Networks Inference on Embedded Processors Using Dynamic Beam Search

DAUIN, Politecnico di Torino, Corso Duca degli Abruzzi 24, 10129 Turin, Italy
* Author to whom correspondence should be addressed.
Electronics 2020, 9(2), 337; https://doi.org/10.3390/electronics9020337
Received: 17 January 2020 / Revised: 6 February 2020 / Accepted: 12 February 2020 / Published: 15 February 2020
(This article belongs to the Special Issue Advanced Embedded HW/SW Development)
Sequence-to-sequence deep neural networks have become the state of the art for a variety of machine learning applications, ranging from neural machine translation (NMT) to speech recognition. Many mobile and Internet of Things (IoT) applications would benefit from the ability to perform sequence-to-sequence inference directly on embedded devices, reducing the amount of raw data transmitted to the cloud and improving response latency, energy consumption and security. However, because of the high computational complexity of these models, specific optimization techniques are needed to achieve acceptable performance and energy consumption on single-core embedded processors. In this paper, we present a new optimization technique called dynamic beam search, in which the inference complexity is tuned at runtime to the difficulty of the input sequence being processed. Results based on measurements on a real embedded device, and on three state-of-the-art deep learning models, show that our method reduces inference time and energy by up to 25% without loss of accuracy.
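
The abstract does not say how the beam width is adapted, so the following is only a minimal sketch of the general idea and not the authors' algorithm. It assumes a hypothetical policy: at each decoding step the beam shrinks to width 1 (greedy) when the model is confident and widens to `max_width` otherwise. The names `dynamic_beam_search`, `toy_model`, `threshold` and `max_width`, and the confidence measure itself (the smallest top-1 probability across the live hypotheses), are illustrative assumptions.

```python
import math
from typing import Callable, Dict, List, Tuple

Prefix = Tuple[str, ...]
# Hypothetical decoder interface: maps a prefix to a next-token distribution.
StepFn = Callable[[Prefix], Dict[str, float]]


def dynamic_beam_search(
    step_fn: StepFn,
    max_width: int = 4,
    threshold: float = 0.8,
    max_len: int = 10,
    eos: str = "</s>",
) -> Tuple[List[str], List[int]]:
    """Return the best hypothesis and the beam width chosen at each step."""
    beams: List[Tuple[Prefix, float]] = [((), 0.0)]  # (prefix, log-probability)
    widths: List[int] = []
    for _ in range(max_len):
        candidates: List[Tuple[Prefix, float]] = []
        confidence = 1.0
        for prefix, score in beams:
            if prefix and prefix[-1] == eos:
                # Finished hypotheses are carried over unchanged.
                candidates.append((prefix, score))
                continue
            dist = step_fn(prefix)
            # Assumed difficulty measure: lowest top-1 probability seen this step.
            confidence = min(confidence, max(dist.values()))
            for token, prob in dist.items():
                if prob > 0.0:
                    candidates.append((prefix + (token,), score + math.log(prob)))
        # Assumed policy: easy step -> greedy, hard step -> full beam.
        width = 1 if confidence >= threshold else max_width
        widths.append(width)
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:width]
        if all(p and p[-1] == eos for p, _ in beams):
            break
    return list(beams[0][0]), widths


def toy_model(prefix: Prefix) -> Dict[str, float]:
    """Stand-in for a real decoder: step 0 is easy, step 1 is ambiguous."""
    if len(prefix) == 0:
        return {"a": 0.9, "b": 0.1}
    if len(prefix) == 1:
        return {"c": 0.6, "d": 0.4}
    return {"</s>": 0.95, "e": 0.05}


tokens, widths = dynamic_beam_search(toy_model)
print(tokens, widths)
```

In this sketch the width chosen at one step determines how many hypotheses are expanded at the next step, so confident steps reduce the number of decoder invocations. A real implementation would need a difficulty measure and thresholds validated against accuracy on the target task.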
Keywords: recurrent neural networks; edge computing; energy efficiency

MDPI and ACS Style

Jahier Pagliari, D.; Daghero, F.; Poncino, M. Sequence-To-Sequence Neural Networks Inference on Embedded Processors Using Dynamic Beam Search. Electronics 2020, 9, 337.
