VirtCO: Joint Coflow Scheduling and Virtual Machine Placement in Cloud Data Centers

Dian Shen; Junzhou Luo; Fang Dong; Junxue Zhang

doi:10.26599/TST.2018.9010098

Tsinghua Science and Technology 2019, 24(5): 630-644 https://doi.org/10.26599/TST.2018.9010098

Open Access | Issue | Published: 29 April 2019


School of Computer Science and Engineering, Southeast University, Nanjing 211189, China.

SING Group, Hong Kong University of Science and Technology, Hong Kong 999077, China.

Keywords: cloud computing, data center, coflow scheduling, Virtual Machine (VM) placement

Cite this article:

Shen D, Luo J, Dong F, et al. VirtCO: Joint Coflow Scheduling and Virtual Machine Placement in Cloud Data Centers. Tsinghua Science and Technology, 2019, 24(5): 630-644. https://doi.org/10.26599/TST.2018.9010098


Abstract

Cloud data centers, such as Amazon EC2, host myriad big data applications using Virtual Machines (VMs). Because these applications are communication-intensive, optimizing network transfers between VMs is critical to both application performance and the network utilization of data centers. Previous studies have addressed this issue either by scheduling network flows with coflow semantics or by optimizing VM placement with traffic considerations. However, coflow scheduling and VM placement have been treated as orthogonal problems. In fact, the two mechanisms are mutually dependent, and optimizing these two complementary degrees of freedom independently turns out to be suboptimal. In this paper, we present VirtCO, a practical framework that jointly schedules coflows and places VMs ahead of VM launch to optimize the overall performance of data center applications. We model the joint coflow scheduling and VM placement optimization problem and propose effective heuristics for solving it. We further implement VirtCO with OpenStack and deploy it in a testbed environment. Extensive evaluation on real-world traces shows that, compared with state-of-the-art solutions, VirtCO reduces the average coflow completion time by up to 36.5%. This new framework is also compatible with and readily deployable within existing data center architectures.
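To illustrate why placement and coflow scheduling interact, the following toy sketch may help. It is not the VirtCO model or its heuristics. It assumes a simplified fabric in which every host is a port on one non-blocking switch with equal bandwidth, treats traffic between VMs on the same host as free, runs coflows one after another in shortest-bottleneck-first order, and searches placements by brute force. All function names, parameters, and workload numbers are hypothetical.

```python
from itertools import product


def bottleneck_time(flows, placement, bandwidth=1.0):
    """Completion time of one coflow running alone (toy model).

    flows: dict mapping (src_vm, dst_vm) -> amount of data to transfer
    placement: dict mapping vm -> host
    Assumption: hosts are ports on a single non-blocking switch, and
    traffic between VMs on the same host never touches the network.
    """
    egress, ingress = {}, {}
    for (src, dst), size in flows.items():
        h_src, h_dst = placement[src], placement[dst]
        if h_src == h_dst:
            continue  # co-located VMs: no network transfer needed
        egress[h_src] = egress.get(h_src, 0.0) + size
        ingress[h_dst] = ingress.get(h_dst, 0.0) + size
    loads = list(egress.values()) + list(ingress.values())
    # The most heavily loaded port determines when the coflow finishes.
    return max(loads, default=0.0) / bandwidth


def average_cct(coflows, placement, bandwidth=1.0):
    """Average coflow completion time when coflows run sequentially,
    shortest bottleneck first (a stand-in for a real coflow scheduler)."""
    times = sorted(bottleneck_time(f, placement, bandwidth) for f in coflows)
    finish, total = 0.0, 0.0
    for t in times:
        finish += t      # this coflow ends after all earlier ones
        total += finish
    return total / len(times)


def best_placement(coflows, vms, hosts, slots_per_host, bandwidth=1.0):
    """Try every placement that respects host capacity (tiny inputs only)."""
    best = None
    for assignment in product(hosts, repeat=len(vms)):
        if any(assignment.count(h) > slots_per_host for h in hosts):
            continue  # too many VMs on one host
        placement = dict(zip(vms, assignment))
        cost = average_cct(coflows, placement, bandwidth)
        if best is None or cost < best[0]:
            best = (cost, placement)
    return best


# Hypothetical workload: two coflows over four VMs, two hosts with two slots each.
coflows = [{("a", "b"): 8.0, ("a", "c"): 2.0}, {("c", "d"): 4.0}]
spread = {"a": "h1", "b": "h2", "c": "h1", "d": "h2"}
print(average_cct(coflows, spread))
print(best_placement(coflows, ["a", "b", "c", "d"], ["h1", "h2"], 2))
```

In this toy workload, the spread placement yields an average completion time of 8.0 time units, whereas co-locating the heavily communicating pairs (a with b, c with d) brings it down to 1.0. Under these assumptions, no ordering of the coflows alone could close that gap, which is the kind of dependency between the two decisions that the abstract describes.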




Publication history

Received: 02 April 2018

Accepted: 01 May 2018

Published: 29 April 2019

Issue date: October 2019


Acknowledgements

This work was supported by the National Key R&D Program of China (No. 2017YFB1003000), the National Natural Science Foundation of China (Nos. 61572129, 61602112, 61502097, 61702096, 61320106007, and 61632008), the International S&T Cooperation Program of China (No. 2015DFA10490), the National Science Foundation of Jiangsu Province (Nos. BK20160695 and BK20170689), the Jiangsu Provincial Key Laboratory of Network and Information Security (No. BM2003201), and the Key Laboratory of Computer Network and Information Integration of Ministry of Education of China (No. 93K-9). It was also partially supported by the Collaborative Innovation Center of Novel Software Technology and Industrialization and the Collaborative Innovation Center of Wireless Communications Technology.