Object detection is an essential component of many systems used, for example, in advanced driver assistance systems (ADAS) or advanced video surveillance systems (AVSS). Currently, the highest detection accuracy is achieved by solutions based on deep convolutional neural networks (DCNNs). Unfortunately, this comes at the cost of high computational complexity; hence, work on the broadly understood acceleration of these algorithms is both important and timely. In this work, we compare three different DCNN hardware accelerator implementation methods: coarse-grained (a custom accelerator called LittleNet), fine-grained (FINN) and sequential (Vitis AI). We evaluate the approaches in terms of object detection accuracy, throughput and energy usage on the VOT and VTB datasets. We also present the limitations of each of the methods considered. We describe the whole process of DNN implementation, including architecture design, training, quantisation and hardware implementation. We used two custom DNN architectures to obtain higher accuracy, higher throughput and lower energy consumption. The first was implemented in SystemVerilog and the second with the FINN tool from AMD Xilinx. Both approaches were then compared with the Vitis AI tool from AMD Xilinx. The final implementations were tested on the Avnet Ultra96-V2 development board with the Zynq UltraScale+ MPSoC ZU3EG device. For the two DNN architectures, we achieved a throughput of 196 fps for our custom accelerator and 111 fps for FINN; the same networks implemented with Vitis AI achieved 123.3 fps and 53.3 fps, respectively.