Data Aggregation Aware Routing for Distributed Training

Chen, Zhaohong; Long, Xin; Wu, Yalan; Chen, Long; Wu, Jigang; Liu, Shuangyin

doi:10.1007/978-3-030-69244-5_21


Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 12606)

Included in the following conference series:

International Conference on Parallel and Distributed Computing: Applications and Technologies


Abstract

In distributed training, parameter synchronization imposes a heavy communication overhead on the network. Data aggregation can alleviate this overhead efficiently. However, existing work on data aggregation assumes streaming message data and therefore does not adapt well to the discrete communication pattern of parameter synchronization. This paper formulates a data aggregation aware routing problem whose objective is to minimize the training finishing time of the global model under a cache capacity constraint. The problem is formulated as a mixed-integer non-linear programming problem and is proved to be NP-hard. We then propose a data aggregation aware routing algorithm that greedily transmits data to the closest aggregation node to reduce the network overhead. Simulation results show that, compared with the shortest path algorithm, the proposed algorithm reduces the average training finishing time by 74% and the network overhead by 33% on average.
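The greedy idea in the abstract can be illustrated with a small sketch. This is not the paper's algorithm: it assumes an unweighted topology, measures overhead as the number of link traversals of one model-sized message, and ignores the cache capacity constraint and the timing model entirely. The function names, the toy topology, and the node labels (`w1`, `s1`, `ps`) are hypothetical.

```python
from collections import deque


def bfs_hops(graph, source):
    """Return the hop distance from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in graph[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist


def shortest_path_overhead(graph, workers, server):
    """Baseline: every worker sends its update straight to the server."""
    to_server = bfs_hops(graph, server)
    return sum(to_server[w] for w in workers)


def greedy_aggregation_overhead(graph, workers, server, agg_nodes):
    """Each worker sends to its closest aggregation node (assumed rule);
    each used aggregation node then forwards one merged update."""
    to_server = bfs_hops(graph, server)
    total = 0
    used = set()
    for w in workers:
        dist = bfs_hops(graph, w)
        # Ties are broken by node name so the result is deterministic.
        best = min(agg_nodes, key=lambda a: (dist[a], a))
        total += dist[best]
        used.add(best)
    # One aggregated message per used aggregation node reaches the server.
    total += sum(to_server[a] for a in used)
    return total


# Toy topology (hypothetical): three workers behind switch s1,
# which reaches the parameter server ps through switch s2.
graph = {
    "w1": ["s1"], "w2": ["s1"], "w3": ["s1"],
    "s1": ["w1", "w2", "w3", "s2"],
    "s2": ["s1", "ps"],
    "ps": ["s2"],
}
workers = ["w1", "w2", "w3"]

print(shortest_path_overhead(graph, workers, "ps"))
print(greedy_aggregation_overhead(graph, workers, "ps", {"s1"}))
```

On this toy topology the baseline uses 9 link traversals and the aggregation-aware routing uses 5, because the three updates are merged at `s1` before crossing the two links to the server. These numbers only illustrate the mechanism and are unrelated to the simulation results reported in the abstract.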

This work was supported in part by a project of the Guangdong Science and Technology Plan under Grant 2019B010121001, the Guangzhou Innovation Platform Construction Plan under Grant 201905010006, the National Natural Science Foundation of China under Grants 61871475, 61702115, and 62072118, and the Jieyang R&D Foundation of Guangdong, China (2017xm037).



Author information

Authors and Affiliations

School of Computer Science and Technology, Guangdong University of Technology, Guangzhou, China
Zhaohong Chen, Xin Long, Yalan Wu, Long Chen & Jigang Wu
Guangzhou Key Laboratory of Agricultural Products Quality and Safety Traceability Information Technology, Zhongkai University of Agriculture and Engineering, Guangzhou, China
Shuangyin Liu


Corresponding author

Correspondence to Jigang Wu.

Editor information

Editors and Affiliations

Shenzhen Institutes of Advanced Technology, Shenzhen, China
Yong Zhang
Shenzhen Institutes of Advanced Technology, Shenzhen, China
Yicheng Xu
Griffith University, Gold Coast, QLD, Australia
Hui Tian

About this paper

Cite this paper

Chen, Z., Long, X., Wu, Y., Chen, L., Wu, J., Liu, S. (2021). Data Aggregation Aware Routing for Distributed Training. In: Zhang, Y., Xu, Y., Tian, H. (eds) Parallel and Distributed Computing, Applications and Technologies. PDCAT 2020. Lecture Notes in Computer Science, vol 12606. Springer, Cham. https://doi.org/10.1007/978-3-030-69244-5_21

DOI: https://doi.org/10.1007/978-3-030-69244-5_21
Published: 21 February 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-69243-8
Online ISBN: 978-3-030-69244-5
eBook Packages: Computer Science, Computer Science (R0)
