Volume 18, Issue 1

M2M: A Simple Matlab-to-MapReduce Translator for Cloud Computing

Junbo Zhang, Dong Xiang, Tianrui Li, Yi Pan
School of Information Science and Technology, Southwest Jiaotong University, Chengdu 610031, China
School of Software, Tsinghua University, Beijing 100084, China
Department of Computer Science, Georgia State University, Atlanta, GA 30303, USA

Abstract

MapReduce is a widely used parallel programming model for cloud computing platforms and has become an effective method for processing massive data on a cluster of computers. An X-to-MapReduce translator (where X is a programming language) is a possible solution to help traditional programmers deploy an application to cloud systems by translating sequential code into MapReduce code. Recently, several SQL-to-MapReduce translators have emerged that translate SQL-like queries into MapReduce code and perform well in cloud systems. However, these translators focus mainly on SQL-like queries rather than on numerical computation. Matlab is a high-level language and interactive environment for numerical computation, visualization, and programming, and it is very popular in engineering. We propose and develop M2M, a simple Matlab-to-MapReduce translator for cloud computing that targets basic numerical computations. M2M can translate Matlab code with up to 100 commands into MapReduce code in a few seconds, a task that may take a proficient Hadoop MapReduce programmer several days to code by hand. In addition, M2M can recognize the dependencies between complex commands, which are often confusing during hand coding. We implemented M2M and evaluated it with Matlab commands on a cluster, using several common commands in our experiments. The results show that M2M is comparable in performance to hand-coded programs.

Keywords: cloud computing, MapReduce, Matlab, translator
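
To make the idea in the abstract concrete, the following is a minimal sketch of how a basic Matlab command such as `sum(A)` or `max(A)` (which operate column-wise on a matrix) might be expressed as a map step and a reduce step. It is not M2M's generated code and does not use Hadoop: `run_mapreduce`, `column_mapper`, `sum_reducer`, `max_reducer`, `column_sum`, and `column_max` are hypothetical names, and the runtime is a small in-memory stand-in assumed only for illustration.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, Iterator, List, Tuple

Row = List[float]


def run_mapreduce(
    records: Iterable[Row],
    mapper: Callable[[Row], Iterator[Tuple[int, float]]],
    reducer: Callable[[int, List[float]], float],
) -> Dict[int, float]:
    """Hypothetical in-memory stand-in for a MapReduce runtime (assumption)."""
    groups: Dict[int, List[float]] = defaultdict(list)
    # Map phase: each input record (one matrix row) emits key/value pairs.
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)
    # Reduce phase: all values sharing a key are combined into one result.
    return {key: reducer(key, values) for key, values in groups.items()}


def column_mapper(row: Row) -> Iterator[Tuple[int, float]]:
    """Emit (column index, entry) for every entry in the row."""
    for col, value in enumerate(row):
        yield col, value


def sum_reducer(key: int, values: List[float]) -> float:
    """Reduce step assumed for Matlab's sum(A): add up one column."""
    return sum(values)


def max_reducer(key: int, values: List[float]) -> float:
    """Reduce step assumed for Matlab's max(A): largest entry of one column."""
    return max(values)


def column_sum(matrix: List[Row]) -> List[float]:
    result = run_mapreduce(matrix, column_mapper, sum_reducer)
    return [result[col] for col in sorted(result)]


def column_max(matrix: List[Row]) -> List[float]:
    result = run_mapreduce(matrix, column_mapper, max_reducer)
    return [result[col] for col in sorted(result)]


if __name__ == "__main__":
    A = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
    print(column_sum(A))  # column-wise sums, like sum(A) in Matlab
    print(column_max(A))  # column-wise maxima, like max(A) in Matlab
```

In this sketch both commands share the same mapper and differ only in the reducer, which suggests why simple numerical commands are natural candidates for automatic translation.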

Publication history

Received: 29 October 2012
Accepted: 20 December 2012
Published: 07 February 2013
Issue date: February 2013

Copyright

© The author(s) 2013

Acknowledgements

This work was partially supported by the National Natural Science Foundation of China (Nos. 61175047, 61100117, and 61202043) and the US National Science Foundation (No. OCI-1156733).
