Volume 18, Issue 1




GPGPU Cloud: A Paradigm for General Purpose Computing

Liang Hu, Xilong Che, Zhenzhen Xie
College of Computer Science and Technology, Jilin University, Changchun 130012, China

Abstract

The Kepler General Purpose GPU (GPGPU) architecture was developed to directly support GPU virtualization, making GPGPU cloud computing more broadly applicable by providing general-purpose computing capability in the form of on-demand virtual resources. This paper describes a baseline GPGPU cloud system built on Kepler GPUs, designed to exploit the hardware's potential while improving task performance. It elaborates a general scheme that divides the cloud system into a cloud layer, a server layer, and a GPGPU layer, and it illustrates the hardware features, task features, scheduling mechanism, and execution mechanism of each layer. The paper thus provides a better understanding of general-purpose computing on a GPGPU cloud.
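The layered scheme described above can be sketched in miniature, purely as an illustration of the idea: the cloud layer dispatches incoming tasks across servers, each server exposes its GPGPUs, and each GPGPU holds a queue of work. The class names and the least-loaded dispatch policy below are hypothetical, not taken from the paper.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Task:
    name: str
    blocks: int  # thread blocks the task's kernel would launch

@dataclass
class Gpgpu:
    gpu_id: int
    queue: List[Task] = field(default_factory=list)

@dataclass
class Server:
    host: str
    gpus: List[Gpgpu]

def schedule(servers: List[Server], tasks: List[Task]) -> None:
    """Cloud-layer sketch: send each task to the least-loaded GPU
    across all servers, measuring load as queued thread blocks."""
    for task in tasks:
        gpus = [g for s in servers for g in s.gpus]
        target = min(gpus, key=lambda g: sum(t.blocks for t in g.queue))
        target.queue.append(task)

# Two servers, three GPUs in total; three tasks to place.
servers = [Server("node0", [Gpgpu(0), Gpgpu(1)]),
           Server("node1", [Gpgpu(0)])]
tasks = [Task("matmul", 64), Task("fft", 32), Task("reduce", 16)]
schedule(servers, tasks)
```

In a real system the server layer would translate each queued task into CUDA kernel launches on its local devices; this sketch only shows how the cloud layer's view of servers and GPGPUs can be kept separate from per-device execution.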

Keywords: Kepler, GK110, GPGPU cloud, virtualization, SMX


Publication history

Received: 10 December 2012
Accepted: 28 December 2012
Published: 07 February 2013
Issue date: February 2013

Copyright

© The author(s) 2013

Acknowledgements

This work was funded by the European Framework Programme (FP7) (No. FP7-PEOPLE-2011-IRSES), the National Natural Science Foundation of China (Nos. 61073009 and 60873235), the Science-Technology Development Key Project of Jilin Province of China (No. 20080318), and the National High-Tech Research and Development Program (863) of China (No. 2011AA010101).
