Improved Heuristic Job Scheduling Method to Enhance Throughput for Big Data Analytics

Zhiyao Hu, Dongsheng Li
College of Computer, National University of Defense Technology, Changsha 410073, China

Abstract

Data-parallel computing platforms, such as Hadoop and Spark, are deployed in computing clusters for big data analytics. Increasingly, multiple users share the same cluster, and scheduling multiple jobs becomes a serious challenge. For a long time, the Shortest-Job-First (SJF) method has been considered the optimal solution for minimizing the average Job Completion Time (JCT). However, SJF yields low system throughput when a small number of short jobs consume a large share of resources, which in turn prolongs the average JCT. We propose an improved heuristic job scheduling method, called the Densest-Job-Set-First (DJSF) method. DJSF schedules jobs so as to maximize the number of jobs completed per unit time, aiming to decrease the average JCT and improve the system throughput. We perform extensive simulations based on Google cluster data. Compared with the SJF method, the DJSF method decreases the average JCT by 23.19% and enhances the system throughput by 42.19%. Compared with Tetris, the job packing method improves the job completion efficiency by 55.4%, so that the computing platform completes more jobs within a short time span.

Keywords: big data, job scheduling, job throughput, job completion time, job completion efficiency
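
To make the scheduling idea concrete, below is a minimal sketch of the DJSF principle described in the abstract: repeatedly run the feasible batch of jobs that completes the most jobs per unit of time, rather than always the single shortest job as SJF does. All concrete details here (the Job fields, a single scalar resource demand against a normalized cluster capacity, density defined as jobs completed per batch makespan, and the greedy packing) are illustrative assumptions; this does not reproduce the paper's exact formulation.

# Sketch of the Densest-Job-Set-First (DJSF) idea: pick the batch of jobs
# that maximizes (number of jobs) / (batch makespan), subject to the batch
# fitting within the cluster's resource capacity. All names and the density
# definition are illustrative assumptions, not the paper's formulation.
# Assumes every job's demand is at most the cluster capacity.

from dataclasses import dataclass
from typing import List

@dataclass
class Job:
    name: str
    duration: float   # estimated running time
    demand: float     # fraction of cluster resources the job needs

def densest_batch(jobs: List[Job], capacity: float) -> List[Job]:
    """Pick a feasible batch (total demand <= capacity) that maximizes
    len(batch) / max(duration in batch)."""
    best_batch, best_density = [], 0.0
    # Try each job's duration as the candidate batch makespan, then pack as
    # many jobs no longer than that makespan as the capacity allows.
    for candidate in jobs:
        makespan = candidate.duration
        eligible = sorted(
            (j for j in jobs if j.duration <= makespan),
            key=lambda j: j.demand,
        )
        batch, used = [], 0.0
        for j in eligible:
            if used + j.demand <= capacity:
                batch.append(j)
                used += j.demand
        density = len(batch) / makespan
        if density > best_density:
            best_batch, best_density = batch, density
    return best_batch

def djsf_schedule(jobs: List[Job], capacity: float = 1.0) -> List[List[str]]:
    """Return a schedule as a list of batches (lists of job names),
    always running the densest remaining batch first."""
    remaining = list(jobs)
    schedule = []
    while remaining:
        batch = densest_batch(remaining, capacity)
        schedule.append([j.name for j in batch])
        scheduled = {j.name for j in batch}
        remaining = [j for j in remaining if j.name not in scheduled]
    return schedule

if __name__ == "__main__":
    # Hypothetical jobs: one short job that hogs the cluster vs. three
    # slightly longer jobs that can run together.
    jobs = [
        Job("hog", 2.0, 0.9),
        Job("a", 3.0, 0.3),
        Job("b", 3.0, 0.3),
        Job("c", 3.0, 0.3),
    ]
    print(djsf_schedule(jobs))  # [['a', 'b', 'c'], ['hog']]

In this toy example, SJF would start the short but resource-hungry job first and delay the three jobs that could have run concurrently, whereas the densest batch {a, b, c} finishes three jobs after 3 time units before the hog runs. This is the throughput effect the abstract describes.
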

References (25)

[1]
S. Singh and Y. Liu, A cloud service architecture for analyzing big monitoring data, Tsinghua Science and Technology, vol. 21, no. 1, pp. 55-70, 2016.
[2]
J. Dean and S. Ghemawat, MapReduce: Simplified data processing on large clusters, in Proc. 6th Conf. Operating System Design & Implementation, San Francisco, CA, USA, 2004, pp. 137-150.
[3]
M. Zaharia, M. Chowdhury, M. J. Franklin, S. Shenker, and I. Stoica, Spark: Cluster computing with working sets, in Proc. 2nd USENIX Workshop on Hot Topics in Cloud Computing, Boston, MA, USA, 2010, pp. 1-10.
[4]
J. C. Tang, M. Xu, S. J. Fu, and K. Huang, A scheduling optimization technique based on reuse in Spark to defend against APT attack, Tsinghua Science and Technology, vol. 23, no. 5, pp. 550-560, 2018.
[5]
R. Grandl, M. Chowdhury, A. Akella, and G. Ananthanarayanan, Altruistic scheduling in multi-resource clusters, in Proc. 12th USENIX Conf. Operating Systems Design and Implementation, Savannah, GA, USA, 2016, pp. 65-80.
[6]
R. Grandl, G. Ananthanarayanan, S. Kandula, S. Rao, and A. Akella, Multi-resource packing for cluster schedulers, in Proc. 2014 ACM Special Interest Group on Data Communication, Chicago, IL, USA, 2014, pp. 455-466.
[7]
J. Wilkes, Google cluster data, https://github.com/google/cluster-data, 2020.
[8]
B. Hindman, A. Konwinski, M. Zaharia, A. Ghodsi, A. D. Joseph, R. Katz, S. Shenker, and I. Stoica, Mesos: A platform for fine-grained resource sharing in the data center, in Proc. 8th USENIX Conf. Networked Systems Design and Implementation, Boston, MA, USA, 2011, pp. 429-483.
[9]
V. K. Vavilapalli, A. C. Murthy, C. Douglas, S. Agarwal, M. Konar, R. Evans, T. Graves, J. Lowe, H. Shah, S. Seth, et al., Apache Hadoop YARN: Yet another resource negotiator, in Proc. 4th Annual Symp. Cloud Computing, Santa Clara, CA, USA, 2013, pp. 1-16.
[10]
T. Bonald, L. Massoulié, A. Proutière, and J. Virtamo, A queueing analysis of max-min fairness, proportional fairness and balanced fairness, Queueing Syst., vol. 53, nos. 1&2, pp. 65-84, 2006.
[11]
A. Ghodsi, M. Zaharia, B. Hindman, A. Konwinski, S. Shenker, and I. Stoica, Dominant resource fairness: Fair allocation of multiple resource types, in Proc. 8th USENIX Conf. Networked Systems Design and Implementation, Boston, MA, USA, 2011, pp. 323-336.
[12]
J. A. Hartigan and M. A. Wong, Algorithm AS 136: A K-means clustering algorithm, J. Roy. Stat. Soc. Ser. C (Appl. Stat.), vol. 28, no. 1, pp. 100-108, 1979.
[13]
M. K. Pakhira, A linear time-complexity k-means algorithm using cluster shifting, in Proc. 2014 IEEE Int. Conf. Computational Intelligence and Communication Networks, Bhopal, India, 2014, pp. 1047-1051.
[14]
J. O. Iglesias, L. Murphy, M. De Cauwer, D. Mehta, and B. O’Sullivan, A methodology for online consolidation of tasks through more accurate resource estimations, in Proc. 2014 IEEE/ACM 7th Int. Conf. Utility and Cloud Computing, London, UK, 2014, pp. 89-98.
[15]
P. Janus and K. Rzadca, SLO-aware colocation of data center tasks based on instantaneous processor requirements, in Proc. 2017 Symp. Cloud Computing, Santa Clara, CA, USA, 2017, pp. 256-268.
[16]
M. Carvalho, D. A. Menascé, and F. Brasileiro, Capacity planning for IaaS cloud providers offering multiple service classes, Future Gener. Comput. Syst., vol. 77, no. 4, pp. 97-111, 2017.
[17]
Z. Y. Hu, D. S. Li, and D. K. Guo, Balance resource allocation for spark jobs based on prediction of the optimal resources, Tsinghua Science and Technology, vol. 25, no. 4, pp. 487-497, 2020.
[18]
S. Venkataraman, Z. H. Yang, M. Franklin, B. Recht, and I. Stoica, Ernest: Efficient performance prediction for large-scale advanced analytics, in Proc. 13th USENIX Conf. Networked Systems Design and Implementation, Santa Clara, CA, USA, 2016, pp. 363-378.
[19]
Z. D. Bei, Z. B. Yu, H. L. Zhang, W. Xiong, C. Z. Xu, L. Eeckhout, and S. Z. Feng, RFHOC: A random-forest approach to auto-tuning Hadoop’s configuration, IEEE Transactions on Parallel and Distributed Systems, vol. 27, no. 5, pp. 1470-1483, 2016.
[20]
Z. B. Yu, Z. D. Bei, and X. H. Qian, Datasize-aware high dimensional configurations auto-tuning of in-memory cluster computing, in Proc. 23rd ACM Int. Conf. Architectural Support for Programming Languages and Operating Systems, Williamsburg, VA, USA, 2018, pp. 564-577.
[21]
Z. Y. Hu, D. S. Li, D. X. Zhang, and Y. X. Chen, ReLoca: Optimize resource allocation for data-parallel jobs using deep learning, in Proc. IEEE Conf. Computer Communications, Toronto, Canada, 2020, pp. 1163-1171.
[22]
Y. Dong, J. Chen, Y. Tang, J. J. Wu, H. Q. Wang, and E. Q. Zhou, Lazy scheduling based disk energy optimization method, Tsinghua Science and Technology, vol. 25, no. 2, pp. 203-216, 2020.
[23]
W. Zhang, X. Chen, and J. H. Jiang, A multi-objective optimization method of initial virtual machine fault-tolerant placement for star topological data centers of cloud systems, Tsinghua Science and Technology, vol. 26, no. 1, pp. 95-111, 2021.
[24]
D. Shen, J. Z. Luo, F. Dong, and J. X. Zhang, VirtCo: Joint coflow scheduling and virtual machine placement in cloud data centers, Tsinghua Science and Technology, vol. 24, no. 5, pp. 630-644, 2019.
[25]
C. Xue, C. Lin, and J. Hu, Scalability analysis of request scheduling in cloud computing, Tsinghua Science and Technology, vol. 24, no. 3, pp. 249-261, 2019.

Publication history

Received: 16 September 2020
Revised: 28 September 2020
Accepted: 29 September 2020
Published: 29 September 2021
Issue date: April 2022

Copyright

© The author(s) 2022

Rights and permissions

The articles published in this open access journal are distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/).
