Abstract
In this paper, task scheduling in MapReduce is considered for geo-distributed data centers on heterogeneous networks. Adaptive heartbeats, job deadlines and data locality are taken into account. Job deadlines are divided according to the maximum data volume of tasks. Under these constraints, task scheduling is formulated as an assignment problem in each heartbeat, in which adaptive heartbeats are calculated from the processing times of tasks, jobs are sequenced in terms of the divided deadlines and tasks are scheduled by the Hungarian algorithm. Taking into account both the data transfer and processing times, the most suitable data center for all mapped jobs is determined in the reduce phase. Experimental results show that the proposed algorithms outperform the existing ones. The proposals with sorted task-sequences perform better than those with random task-sequences.
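To make the per-heartbeat assignment step concrete, the following is a minimal sketch of the Hungarian algorithm applied to a small task-to-slot cost matrix. It is not the authors' implementation: the function name `hungarian`, the variable names, and the cost values are illustrative assumptions, and each cost entry is assumed to stand for the data transfer time plus the processing time of running a task on a given slot.

```python
def hungarian(cost):
    """Minimum-cost assignment of rows (tasks) to columns (slots).

    Illustrative sketch: cost[i][j] is an assumed time estimate for
    running task i on slot j. Requires len(cost) <= len(cost[0]).
    Returns a list where entry i is the slot chosen for task i.
    """
    n, m = len(cost), len(cost[0])
    assert n <= m, "need at least as many slots as tasks"
    INF = float("inf")
    u = [0] * (n + 1)    # row potentials (1-indexed)
    v = [0] * (m + 1)    # column potentials (1-indexed)
    p = [0] * (m + 1)    # p[j] = row currently matched to column j
    way = [0] * (m + 1)  # back-pointers along the augmenting path

    for i in range(1, n + 1):
        p[0] = i
        j0 = 0
        minv = [INF] * (m + 1)
        used = [False] * (m + 1)
        # Grow an alternating tree until a free column is reached.
        while True:
            used[j0] = True
            i0 = p[j0]
            delta = INF
            j1 = 0
            for j in range(1, m + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j] = cur
                        way[j] = j0
                    if minv[j] < delta:
                        delta = minv[j]
                        j1 = j
            # Shift potentials so that one more edge becomes tight.
            for j in range(m + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:
                break
        # Flip the matching along the augmenting path.
        while True:
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
            if j0 == 0:
                break

    assignment = [-1] * n
    for j in range(1, m + 1):
        if p[j] != 0:
            assignment[p[j] - 1] = j - 1
    return assignment


# Hypothetical costs for 3 tasks on 3 slots within one heartbeat.
cost = [
    [4, 1, 3],
    [2, 0, 5],
    [3, 2, 2],
]
assignment = hungarian(cost)
total = sum(cost[i][j] for i, j in enumerate(assignment))
print(assignment, total)
```

In a scheduler of the kind the abstract describes, such a matrix would presumably be rebuilt at every heartbeat from the tasks of the jobs that come first in the deadline-ordered sequence and the slots that are currently free.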
Acknowledgments
This work is supported by the National Natural Science Foundation of China (61572127, 61272377) and the Specialized Research Fund for the Doctoral Program of Higher Education (20120092110027).
Cite this article
Wang, J., Li, X. Task scheduling for MapReduce in heterogeneous networks. Cluster Comput 19, 197–210 (2016). https://doi.org/10.1007/s10586-015-0503-3
DOI: https://doi.org/10.1007/s10586-015-0503-3