Powering One-Shot Topological NAS with Stabilized Share-Parameter Proxy

Guo, Ronghao; Lin, Chen; Li, Chuming; Tian, Keyu; Sun, Ming; Sheng, Lu; Yan, Junjie

doi:10.1007/978-3-030-58568-6_37

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 12359)

Included in the following conference series:

European Conference on Computer Vision

Abstract

One-shot NAS methods have attracted much interest from the research community due to their remarkable training efficiency and capacity to discover high-performance models. However, the search spaces of previous one-shot works usually relied on hand-crafted designs and lacked flexibility in network topology. In this work, we aim to enhance one-shot NAS by exploring high-performing network architectures in our large-scale Topology Augmented Search Space (i.e., over \(3.4 \times 10^{10}\) different topological structures). Specifically, the difficulties of architecture search in such a complex space have been eliminated by the proposed stabilized share-parameter proxy, which employs Stochastic Gradient Langevin Dynamics to enable fast shared-parameter sampling, so as to achieve stabilized measurement of architecture performance even in search spaces with complex topological structures. The proposed method, namely Stabilized Topological Neural Architecture Search (ST-NAS), achieves state-of-the-art performance under a Multiply-Adds (MAdds) constraint on ImageNet. Our lite model ST-NAS-A achieves \(76.4\%\) top-1 accuracy with only 326M MAdds. Our moderate model ST-NAS-B achieves \(77.9\%\) top-1 accuracy while requiring only 503M MAdds. Both of our models offer superior performance in comparison to other concurrent works on one-shot NAS.
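
For readers unfamiliar with Stochastic Gradient Langevin Dynamics, the generic update of Welling and Teh can be written as \(\theta_{t+1} = \theta_t - \tfrac{\epsilon}{2}\,\nabla U(\theta_t) + \eta_t\) with \(\eta_t \sim \mathcal{N}(0, \epsilon I)\), where \(U\) is a negative log-posterior. The sketch below illustrates only that generic rule on a toy one-dimensional target; it is not the ST-NAS implementation. The function names, the step size, the burn-in length and the toy target are all assumptions made for illustration, and the abstract does not specify how the authors configure the sampler. In a weight-sharing setting the gradient would presumably be a minibatch estimate over the shared parameters, whereas this toy uses an exact gradient.

```python
import math
import random


def sgld_step(theta, grad, step_size, rng):
    """One Langevin update: half a gradient step plus noise of variance step_size."""
    noise_scale = math.sqrt(step_size)
    return [
        t - 0.5 * step_size * g + noise_scale * rng.gauss(0.0, 1.0)
        for t, g in zip(theta, grad)
    ]


def sample_chain(grad_fn, theta0, step_size, n_steps, seed=0):
    """Run the update repeatedly and record every visited parameter vector."""
    rng = random.Random(seed)
    theta = list(theta0)
    samples = []
    for _ in range(n_steps):
        theta = sgld_step(theta, grad_fn(theta), step_size, rng)
        samples.append(theta)
    return samples


def toy_grad(theta):
    """Gradient of the toy potential U(theta) = 0.5 * ||theta||^2 (standard normal)."""
    return list(theta)


# Hypothetical settings, chosen only so the toy chain mixes quickly.
samples = sample_chain(toy_grad, theta0=[3.0], step_size=0.1, n_steps=20000)
kept = [s[0] for s in samples[1000:]]  # discard an assumed burn-in period
mean = sum(kept) / len(kept)
var = sum((x - mean) ** 2 for x in kept) / len(kept)
print(f"sample mean ~ {mean:.2f}, sample variance ~ {var:.2f}")
```

With a small fixed step size the chain's samples should have mean near 0 and variance near 1, matching the toy target up to discretization error. The injected noise is what turns an optimizer into a sampler: instead of a single point estimate of the parameters, the chain yields many plausible parameter draws.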

R. Guo and C. Lin—Contributed equally.

Acknowledgement

This work was supported by the National Key Research and Development Project of China (No. 2018AAA0101900).

Author information

Authors and Affiliations

College of Software, Beihang University, Beijing, China
Ronghao Guo, Keyu Tian & Lu Sheng
SenseTime Research, Hong Kong, China
Chen Lin, Chuming Li, Ming Sun & Junjie Yan

Corresponding author

Correspondence to Lu Sheng.

Editor information

Editors and Affiliations

University of Oxford, Oxford, UK
Andrea Vedaldi
Graz University of Technology, Graz, Austria
Horst Bischof
University of Freiburg, Freiburg im Breisgau, Germany
Thomas Brox
University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
Jan-Michael Frahm

About this paper

Cite this paper

Guo, R. et al. (2020). Powering One-Shot Topological NAS with Stabilized Share-Parameter Proxy. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, JM. (eds) Computer Vision – ECCV 2020. ECCV 2020. Lecture Notes in Computer Science, vol 12359. Springer, Cham. https://doi.org/10.1007/978-3-030-58568-6_37

DOI: https://doi.org/10.1007/978-3-030-58568-6_37
Published: 13 November 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-58567-9
Online ISBN: 978-3-030-58568-6
eBook Packages: Computer Science, Computer Science (R0)
