Open Access Article
Electronics 2016, 5(4), 88; doi:10.3390/electronics5040088

GPGPU Accelerated Deep Object Classification on a Heterogeneous Mobile Platform

1 Dipartimento di Automatica e Informatica (DAUIN), Politecnico di Torino, Turin 10129, Italy
2 Joint Open Lab, Telecom Italia Mobile (TIM), Turin 10129, Italy
* Authors to whom correspondence should be addressed.
Academic Editor: Mostafa Bassiouni
Received: 5 September 2016 / Revised: 29 November 2016 / Accepted: 5 December 2016 / Published: 9 December 2016

Abstract

Deep convolutional neural networks achieve state-of-the-art performance in image classification. However, their computational and memory requirements are very large, which is a problem on resource-constrained embedded devices. Most of this complexity derives from the convolutional layers, and in particular from the matrix multiplications they entail. This paper proposes a complete approach to image classification that provides the layers commonly used in neural networks. Specifically, the proposed approach relies on a heterogeneous CPU-GPU scheme for performing convolutions in the transform domain. The Compute Unified Device Architecture (CUDA)-based implementation of the proposed approach is evaluated over three different image classification networks on a Tegra K1 CPU-GPU mobile processor. Experiments show that the presented heterogeneous scheme achieves a 50× speedup over the CPU-only reference and outperforms a GPU-based reference by 2×, while reducing power consumption by nearly 30%.
Keywords: machine vision; image analysis; image processing; concurrent computing; neural networks; mobile computing; multicore processing; convolution; ubiquitous computing
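
The idea of computing a convolution in the transform domain can be illustrated with a minimal sketch. It assumes the transform is the discrete Fourier transform, which the abstract does not state, and it uses a one-dimensional signal and a naive O(n²) transform for readability. The function names `dft`, `conv_direct`, and `conv_transform` are hypothetical and are not taken from the paper or from its CUDA implementation.

```python
import cmath


def dft(x, inverse=False):
    """Naive O(n^2) discrete Fourier transform (illustrative, not fast)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        s = sum(x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n)
                for t in range(n))
        # The inverse transform carries the 1/n normalisation.
        out.append(s / n if inverse else s)
    return out


def conv_direct(signal, kernel):
    """Linear convolution computed from the sliding-sum definition."""
    out = [0.0] * (len(signal) + len(kernel) - 1)
    for i, s in enumerate(signal):
        for j, k in enumerate(kernel):
            out[i + j] += s * k
    return out


def conv_transform(signal, kernel):
    """Same convolution via a pointwise product in the transform domain."""
    n = len(signal) + len(kernel) - 1
    # Zero-pad both inputs to the full output length so that the circular
    # convolution implied by the DFT equals the linear convolution.
    a = dft(list(signal) + [0.0] * (n - len(signal)))
    b = dft(list(kernel) + [0.0] * (n - len(kernel)))
    product = [p * q for p, q in zip(a, b)]
    return [v.real for v in dft(product, inverse=True)]


signal = [1.0, 2.0, 3.0, 4.0]
kernel = [0.5, -1.0, 0.25]
direct = conv_direct(signal, kernel)
via_transform = conv_transform(signal, kernel)
```

Both functions return the same six values up to floating-point rounding. With a fast transform in place of the naive one, the pointwise product replaces the nested sliding sum, which is the general motivation for transform-domain convolution.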

This is an open access article distributed under the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


MDPI and ACS Style

Rizvi, S.T.H.; Cabodi, G.; Patti, D.; Francini, G. GPGPU Accelerated Deep Object Classification on a Heterogeneous Mobile Platform. Electronics 2016, 5, 88.


Note that from the first issue of 2016, MDPI journals use article numbers instead of page numbers.

Electronics, EISSN 2079-9292, published by MDPI AG, Basel, Switzerland.