Adaptive Tensor-Train Decomposition for Neural Network Compression

  • Conference paper
  • In: Parallel and Distributed Computing, Applications and Technologies (PDCAT 2020)

Abstract

Directly deploying complex deep neural networks on mobile devices with limited computing power and battery life is difficult and costly. This paper addresses the problem by improving model compactness and computational efficiency. Building on MobileNet, a mainstream lightweight neural network, we propose an Adaptive Tensor-Train Decomposition (ATTD) algorithm that removes the cumbersome search for optimal decomposition ranks. Because tensor-train decomposition brings little forward-pass acceleration on GPUs, our strategy of using fewer decomposition dimensions and moderate decomposition ranks, combined with dynamic programming, effectively reduces both the number of parameters and the amount of computation. We also build a real-time target detection network for mobile devices. Extensive experimental results show that the proposed method greatly reduces the number of parameters and the amount of computation, improving inference speed on mobile devices.
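
The abstract does not spell out the ATTD rank-selection procedure itself. As a rough, minimal sketch of the underlying machinery, the NumPy code below implements the standard TT-SVD (Oseledets, 2011), choosing each TT-rank adaptively from a singular-value energy threshold. The function names, the `eps` threshold, and the example layer shape are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def tt_svd(weight, dims, eps=0.02):
    """Decompose a weight matrix, reshaped to `dims`, into TT cores.

    Each TT-rank is picked adaptively: keep the smallest number of singular
    values whose relative truncation error stays below `eps` (a stand-in for
    the paper's adaptive rank selection, which is not given in the abstract).
    """
    c = np.asarray(weight, dtype=np.float64).reshape(dims)
    d = len(dims)
    cores, rank = [], 1
    for k in range(d - 1):
        c = c.reshape(rank * dims[k], -1)
        u, s, vt = np.linalg.svd(c, full_matrices=False)
        cum = np.cumsum(s ** 2)
        rel_err = np.sqrt(np.maximum(cum[-1] - cum, 0.0) / cum[-1])
        new_rank = int(np.argmax(rel_err <= eps)) + 1  # smallest rank meeting eps
        cores.append(u[:, :new_rank].reshape(rank, dims[k], new_rank))
        c = s[:new_rank, None] * vt[:new_rank]  # carry the remainder forward
        rank = new_rank
    cores.append(c.reshape(rank, dims[-1], 1))
    return cores

def tt_reconstruct(cores):
    """Contract the TT cores back into the full tensor (only for error checks)."""
    full = cores[0]
    for core in cores[1:]:
        full = np.tensordot(full, core, axes=([full.ndim - 1], [0]))
    return full[0, ..., 0]

# Example (assumed shapes): factor a 1024x1024 fully connected layer as a 4-way tensor.
w = np.random.randn(1024, 1024)
cores = tt_svd(w, dims=(32, 32, 32, 32), eps=0.05)
approx = tt_reconstruct(cores).reshape(1024, 1024)
print([c.shape for c in cores], np.linalg.norm(w - approx) / np.linalg.norm(w))
```

In a tensorized layer the weights are stored and applied directly in this factored form, which is where the parameter and computation savings come from; the reconstruction above is only used to check the approximation error of the chosen ranks.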

Y. Zheng and Y. Zhou contributed equally to this work and should be considered co-first authors.

This work is partially supported by the National Key R&D Program of China under grant No. 2019YFB2102600 and by NSFC (Nos. 61971269 and 61832012).



Author information

Corresponding author: Dongxiao Yu.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Zheng, Y., Zhou, Y., Zhao, Z., Yu, D. (2021). Adaptive Tensor-Train Decomposition for Neural Network Compression. In: Zhang, Y., Xu, Y., Tian, H. (eds) Parallel and Distributed Computing, Applications and Technologies. PDCAT 2020. Lecture Notes in Computer Science, vol. 12606. Springer, Cham. https://doi.org/10.1007/978-3-030-69244-5_6

  • DOI: https://doi.org/10.1007/978-3-030-69244-5_6

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-69243-8

  • Online ISBN: 978-3-030-69244-5
