
Distributed and Weighted Extreme Learning Machine for Imbalanced Big Data Learning

Authors: Zhiqiong Wang, Junchang Xin (corresponding author), Hongxu Yang, Shuo Tian, Ge Yu, Chenren Xu, Yudong Yao
Sino-Dutch Biomedical & Information Engineering School, Northeastern University, Shenyang 110169, China.
School of Computer Science & Engineering, Northeastern University, Shenyang 110169, China.
School of Electronics Engineering and Computer Science, Peking University, Beijing 100871, China.
Department of Electrical and Computer Engineering, Stevens Institute of Technology, Castle Point on Hudson, Hoboken, NJ 07030, USA.

Abstract

The Extreme Learning Machine (ELM) and its variants are effective in many machine learning applications, such as Imbalanced Learning (IL) and Big Data (BD) learning. However, no existing variant can handle data that are both imbalanced and large in volume. This study addresses the IL problem in BD applications by proposing the Distributed and Weighted ELM (DW-ELM) algorithm, which is built on the MapReduce framework. To confirm that the computation can be parallelized, we first show that the matrix multiplications at the core of weighted ELM are decomposable. Then, to further improve computational efficiency, an Improved DW-ELM algorithm (IDW-ELM) is developed that requires only a single MapReduce job. Finally, the effectiveness of the proposed DW-ELM and IDW-ELM algorithms is validated through experiments.

Keywords: weighted Extreme Learning Machine (ELM), imbalanced big data, MapReduce framework, user-defined counter
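
As a sketch of why the training computation parallelizes: in weighted ELM the output weights take the form beta = (I/C + H^T W H)^(-1) H^T W T, and both H^T W H and H^T W T are sums of per-partition contributions. Each data split can therefore compute its partial matrices independently (a map phase), and a single summation (a reduce phase) recovers the full products. The Python snippet below is a minimal, hypothetical illustration of this decomposition, not the authors' MapReduce implementation; the sigmoid activation, the per-class weighting scheme, and the partition layout are assumptions made only for the example.

```python
# Minimal sketch (assumed details, not the authors' code): the weighted ELM
# solution beta = (I/C + H^T W H)^(-1) H^T W T is computed from per-partition
# partial sums, which is what makes a MapReduce-style version possible.
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def partial_sums(X_part, T_part, w_part, A, b):
    """'Map' step for one data partition: local H^T W H and H^T W T."""
    H = sigmoid(X_part @ A + b)          # hidden-layer output for this partition
    HW = H * w_part[:, None]             # apply per-sample class weights
    return HW.T @ H, HW.T @ T_part       # decomposable matrix products

def combine_and_solve(partials, C, L):
    """'Reduce' step: sum the partial matrices and solve for beta."""
    U = sum(p[0] for p in partials)      # = H^T W H over the full data set
    V = sum(p[1] for p in partials)      # = H^T W T over the full data set
    return np.linalg.solve(np.eye(L) / C + U, V)

# Toy usage: an imbalanced two-class problem split into two "partitions".
rng = np.random.default_rng(0)
d, L = 5, 20                              # input dimension, hidden nodes
A, b = rng.normal(size=(d, L)), rng.normal(size=L)
X = rng.normal(size=(1000, d))
y = (rng.random(1000) < 0.1).astype(int)  # roughly 10% minority class
T = np.eye(2)[y]                          # one-hot targets
w = np.where(y == 1, 1.0 / (y == 1).sum(), 1.0 / (y == 0).sum())  # class weights
parts = [partial_sums(X[i:i+500], T[i:i+500], w[i:i+500], A, b) for i in (0, 500)]
beta = combine_and_solve(parts, C=1.0, L=L)
print(beta.shape)                         # (20, 2) output weights
```

Because only the small L x L and L x m partial matrices need to be combined across partitions, the communication cost of such a decomposition is independent of the number of training samples, which is what makes the approach attractive for large-volume data.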


Publication history

Received: 27 August 2016
Revised: 14 January 2017
Accepted: 18 January 2017
Published: 06 April 2017
Issue date: April 2017

Copyright

© The author(s) 2017

Acknowledgements

This research was partially supported by the National Natural Science Foundation of China (Nos. 61402089, 61472069, and 61501101), the Fundamental Research Funds for the Central Universities (Nos. N161904001, N161602003, and N150408001), the Natural Science Foundation of Liaoning Province (No. 2015020553), the China Postdoctoral Science Foundation (No. 2016M591447), and the Postdoctoral Science Foundation of Northeastern University (No. 20160203).
