ABSTRACT
Over the last five years, Deep Neural Networks (DNNs) have offered more accurate solutions to many problems in speech recognition and computer vision, and these solutions have surpassed a threshold of acceptability for many applications. As a result, DNNs have supplanted other approaches to solving problems in these areas and have enabled many new applications. While DNN design is still something of an art form, in our work we have found that the basic principles of design space exploration used to develop embedded microprocessor architectures are highly applicable to the design of DNN architectures. In particular, we have used these design principles to create a novel DNN called SqueezeNet that requires only 480KB of storage for its model parameters. We have further distilled these experiences into something of a playbook for creating small DNNs for embedded systems.
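To make the size claim concrete: SqueezeNet's small parameter count comes from composing "Fire" modules, in which a 1x1 "squeeze" convolution cuts the channel count before parallel 1x1 and 3x3 "expand" convolutions whose outputs are concatenated. Below is a minimal PyTorch sketch of such a module; the structure and the channel sizes in the usage example follow the fire2 layer of the SqueezeNet paper (Iandola et al., 2016), while the class and argument names are our own illustrative choices, not code from the paper.

```python
import torch
import torch.nn as nn

class Fire(nn.Module):
    """Sketch of a SqueezeNet 'Fire' module: a 1x1 'squeeze' convolution
    reduces the channel count, then parallel 1x1 and 3x3 'expand'
    convolutions are applied and concatenated along the channel axis."""

    def __init__(self, in_ch, squeeze_ch, expand1x1_ch, expand3x3_ch):
        super().__init__()
        self.squeeze = nn.Conv2d(in_ch, squeeze_ch, kernel_size=1)
        self.expand1x1 = nn.Conv2d(squeeze_ch, expand1x1_ch, kernel_size=1)
        self.expand3x3 = nn.Conv2d(squeeze_ch, expand3x3_ch, kernel_size=3,
                                   padding=1)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        x = self.relu(self.squeeze(x))
        # Concatenate the two expand branches channel-wise.
        return torch.cat(
            [self.relu(self.expand1x1(x)), self.relu(self.expand3x3(x))],
            dim=1,
        )

# Usage: channel sizes matching the paper's fire2 layer
# (96 input channels -> 16 squeeze -> 64 + 64 expand = 128 output channels).
fire2 = Fire(96, 16, 64, 64)
out = fire2(torch.randn(1, 96, 55, 55))  # -> torch.Size([1, 128, 55, 55])
```

The savings come from the squeeze step: the 3x3 filters see only 16 input channels rather than 96, so this module uses roughly 12K weights where a plain 3x3 convolution with the same 128-channel output would use about 110K.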