ABSTRACT
The recent ground-breaking advances in deep neural networks (DNNs) make them attractive for embedded systems. However, DNN inference can take a long time on resource-limited embedded devices. Offloading the computation to the cloud is often infeasible due to privacy concerns, high latency, or the lack of connectivity. As such, there is a critical need for a way to execute DNN models effectively on the devices themselves.
This paper presents an adaptive scheme that determines which DNN model to use for a given input, taking into account the desired accuracy and inference time. Our approach employs machine learning to build a predictive model that quickly selects a pre-trained DNN for a given input under a given optimization constraint. We achieve this by first training the predictive model off-line, and then using the learned model to select a DNN for new, unseen inputs. We apply our approach to the image classification task and evaluate it on a Jetson TX2 embedded deep learning platform using the ImageNet ILSVRC 2012 validation dataset, considering a range of influential DNN models. Experimental results show that our approach achieves a 7.52% improvement in inference accuracy and a 1.8x reduction in inference time over the most capable single DNN model.
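The two-phase scheme described above can be sketched in a few lines. This is a minimal illustration only, not the authors' actual implementation: the k-NN premodel, the feature vectors, and the candidate model labels are all hypothetical stand-ins for whatever features and selector the paper trains off-line.

```python
# Sketch of an offline-trained "premodel" that routes each input to a DNN.
# All names and numbers here are illustrative assumptions, not the paper's.
import math

def knn_select(train, features, k=3):
    """Pick the DNN chosen by a majority of the k nearest training inputs.

    train: list of (feature_vector, best_model_label) pairs, gathered
    off-line by profiling every candidate DNN on every training input.
    """
    dists = sorted((math.dist(f, features), label) for f, label in train)
    votes = {}
    for _, label in dists[:k]:
        votes[label] = votes.get(label, 0) + 1
    return max(votes, key=votes.get)

# Offline phase: hypothetical cheap image features (e.g. normalized edge
# density, brightness) paired with the cheapest DNN that gets each input right.
training_data = [
    ([0.10, 0.20], "MobileNet"),
    ([0.15, 0.25], "MobileNet"),
    ([0.80, 0.90], "ResNet-152"),
    ([0.75, 0.85], "ResNet-152"),
    ([0.50, 0.50], "ResNet-50"),
]

# Online phase: a new, unseen input is routed with negligible overhead;
# only the selected DNN is then actually run on the device.
print(knn_select(training_data, [0.12, 0.22]))  # prints "MobileNet"
```

The payoff is that "easy" inputs run on a cheap, fast model while only hard inputs pay for a deep, slow one, which is how the scheme can beat the single most capable DNN on both accuracy and latency.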
Index Terms
- Adaptive deep learning model selection on embedded systems