DOI: 10.1145/3125502.3125606
Research Article · Public Access

Small neural nets are beautiful: enabling embedded systems with small deep-neural-network architectures

Published: 15 October 2017

ABSTRACT

Over the last five years, Deep Neural Nets have offered increasingly accurate solutions to many problems in speech recognition and computer vision, and these solutions have surpassed a threshold of acceptability for many applications. As a result, Deep Neural Networks have supplanted other approaches to solving problems in these areas and have enabled many new applications. While the design of Deep Neural Nets is still something of an art form, we have found that the basic principles of design space exploration used to develop embedded microprocessor architectures are highly applicable to the design of Deep Neural Net architectures. In particular, we have used these design principles to create a novel Deep Neural Net called SqueezeNet that requires only 480KB of storage for its model parameters. We have further integrated all of these experiences into something of a playbook for creating small Deep Neural Nets for embedded systems.
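As a concrete illustration of the design-space arithmetic behind architectures like SqueezeNet, the sketch below compares the weight count of a Fire-module-style block (a 1x1 "squeeze" layer feeding parallel 1x1 and 3x3 "expand" layers) against a plain 3x3 convolution with the same input and output channel counts. The channel sizes chosen here are illustrative assumptions, not the exact SqueezeNet configuration.

```python
def fire_module_params(in_ch, squeeze, expand1x1, expand3x3):
    """Weight count of one Fire-style module (biases omitted)."""
    squeeze_params = in_ch * squeeze * 1 * 1           # 1x1 squeeze layer
    expand_params = (squeeze * expand1x1 * 1 * 1       # 1x1 expand layer
                     + squeeze * expand3x3 * 3 * 3)    # 3x3 expand layer
    return squeeze_params + expand_params

def plain_conv_params(in_ch, out_ch, k=3):
    """Weight count of a standard k x k convolution (biases omitted)."""
    return in_ch * out_ch * k * k

# Illustrative sizes: 128 input channels, 16 squeeze filters,
# 64 + 64 expand filters (128 output channels in total).
fire = fire_module_params(128, 16, 64, 64)   # -> 12288 weights
plain = plain_conv_params(128, 128)          # -> 147456 weights
print(fire, plain, plain / fire)             # roughly 12x fewer parameters
```

By routing most of the computation through cheap 1x1 convolutions and shrinking the channel count before the 3x3 layer, the module reaches the same output width with an order of magnitude fewer weights — the kind of trade-off a systematic design space exploration makes visible.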

Published in

CODES '17: Proceedings of the Twelfth IEEE/ACM/IFIP International Conference on Hardware/Software Codesign and System Synthesis Companion
October 2017, 84 pages
ISBN: 9781450351850
DOI: 10.1145/3125502

Copyright © 2017 ACM. Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery, New York, NY, United States


Acceptance Rates

Overall Acceptance Rate: 280 of 864 submissions, 32%
