Balance Resource Allocation for Spark Jobs Based on Prediction of the Optimal Resource

Zhiyao Hu, Dongsheng Li, and Deke Guo
College of Computer, National University of Defense Technology, Changsha 410073, China.
College of System Engineering, National University of Defense Technology, Changsha 410073, China.

Abstract

Apache Spark provides a well-known MapReduce computing framework designed to process big data analytics quickly in a data-parallel manner. On this platform, large input data are divided into data partitions, which are processed concurrently by multiple computation tasks, and the outputs of these tasks are transferred among multiple computers via the network. However, such a distributed computing framework suffers from system overheads, inevitably caused by communication and disk I/O operations, and these overheads take up a large proportion of the Job Completion Time (JCT). We observed that allocating excessive computational resources incurs considerable system overheads and prolongs the JCT. Over-allocating resources to individual jobs not only prolongs their own JCTs but also likely leaves other jobs under-allocated, so the average JCT is suboptimal as well. To address this problem, we propose a prediction model that estimates how the JCT of a single Spark job changes with the allocated resources. Building on this prediction model, we design a heuristic algorithm that balances the resource allocation of multiple Spark jobs to minimize the average JCT in multiple-job cases. We implemented the prediction model and the resource allocation method in ReB, a Resource-Balancer based on Apache Spark. Experimental results showed that ReB significantly outperformed the traditional max-min fairness and shortest-job-optimal methods, decreasing the average JCT by around 10%-30% compared with the existing solutions.

Keywords: Spark jobs, resource over-allocation, performance prediction
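
The abstract describes the approach only at a high level. The following minimal Python sketch illustrates the underlying idea under stated assumptions: a toy JCT model in which computation time shrinks as more cores are allocated while communication and disk I/O overheads grow, and a greedy heuristic that hands out cores to whichever job's predicted JCT benefits most. The function names, the parametric form of predicted_jct, and the example job parameters are hypothetical and are not taken from the paper or from the ReB implementation.

# Minimal, hypothetical sketch of prediction-guided resource balancing.
# All names and the JCT model below are illustrative assumptions.

def predicted_jct(job, cores):
    """Toy JCT model: computation shrinks with more cores, while
    communication/disk I/O overhead grows, so over-allocation can hurt."""
    compute = job["work"] / cores                 # data-parallel computation time
    overhead = job["overhead_per_core"] * cores   # shuffle / disk I/O overhead
    return compute + overhead + job["fixed_cost"]

def balance_allocation(jobs, total_cores):
    """Greedy heuristic: start every job at one core, then repeatedly give
    the next core to the job whose predicted JCT drops the most."""
    alloc = {j["name"]: 1 for j in jobs}
    for _ in range(total_cores - len(jobs)):
        best_job, best_gain = None, 0.0
        for j in jobs:
            c = alloc[j["name"]]
            gain = predicted_jct(j, c) - predicted_jct(j, c + 1)
            if gain > best_gain:      # skip jobs that an extra core would not help
                best_job, best_gain = j, gain
        if best_job is None:          # every job is at its predicted optimum
            break
        alloc[best_job["name"]] += 1
    return alloc

if __name__ == "__main__":
    jobs = [
        {"name": "wordcount", "work": 1200.0, "overhead_per_core": 2.0, "fixed_cost": 10.0},
        {"name": "pagerank",  "work": 4800.0, "overhead_per_core": 1.5, "fixed_cost": 15.0},
    ]
    print(balance_allocation(jobs, total_cores=32))

In this sketch, allocation stops growing for a job once an additional core no longer reduces its predicted JCT, which mirrors the over-allocation effect the abstract describes; the paper's actual prediction model and heuristic may differ.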


Publication history

Received: 30 March 2019
Revised: 24 July 2019
Accepted: 09 September 2019
Published: 13 January 2020
Issue date: August 2020

Copyright

© The author(s) 2020

Acknowledgements

This work was supported in part by the National Key R&D Program of China (No. 2018YFB2101100), the National Natural Science Foundation of China (Nos. 61932001 and 61872376), and the Hunan Provincial Innovation Foundation for Postgraduate.

Rights and permissions

The articles published in this open access journal are distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/).
