Volume 1, Issue 4

Distributed Storage System for Electric Power Data Based on HBase

Jiahui Jin, Aibo Song, Huan Gong, Yingying Xue, Mingyang Du, Fang Dong, Junzhou Luo
School of Computer Science and Engineering, Southeast University, Nanjing 211189, China.

Abstract

Managing massive electric power data is a typical big data application because electric power systems generate millions or billions of status, debugging, and error records every day. To guarantee the safety and sustainability of electric power systems, these data must be processed and analyzed quickly enough to support real-time decisions. Traditional solutions typically use relational databases to manage electric power data. However, relational databases cannot process and analyze such data efficiently once the data size grows significantly. In this paper, we show how electric power data can be managed using HBase, a distributed database maintained by Apache. Our system consists of clients, an HBase database, status monitors, data migration modules, and data fragmentation modules. We evaluate the performance of our system through a series of experiments and show how HBase’s parameters can be tuned to improve its efficiency.
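The abstract does not describe the table schema, so the following is only an illustrative sketch of how time-stamped status records from power equipment might be keyed in a sorted key-value store such as HBase. It assumes a commonly used row-key pattern (a salt prefix, a device identifier, and a reversed timestamp). The function name `make_row_key`, the device ID format, the `|` separator, and the bucket count are all hypothetical and are not taken from the paper.

```python
import struct

# Largest signed 64-bit value; subtracting a timestamp from it "reverses" time order.
MAX_TS_MS = 2**63 - 1


def make_row_key(device_id: str, ts_ms: int, salt_buckets: int = 16) -> bytes:
    """Build a hypothetical row key: salt byte | device id | '|' | reversed timestamp.

    Assumes device ids do not contain '|' and salt_buckets <= 256.
    """
    raw_id = device_id.encode("utf-8")
    # Deterministic salt so that all records of one device share a prefix,
    # while different devices are spread over several key ranges.
    salt = sum(raw_id) % salt_buckets
    # Big-endian packing keeps numeric order equal to byte order for
    # non-negative values, so newer records sort before older ones.
    reversed_ts = MAX_TS_MS - ts_ms
    return bytes([salt]) + raw_id + b"|" + struct.pack(">q", reversed_ts)


older = make_row_key("meter-001", 1_500_000_000_000)
newer = make_row_key("meter-001", 1_500_000_060_000)
print(newer < older)  # the newer record sorts first
```

Under these assumptions, a scan that starts at a device's key prefix would encounter its most recent records first, which suits monitoring queries that mostly read the latest status. Whether the system described in the paper uses a similar layout is not stated in the abstract.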

Keywords: electric power data, HBase, data storage
Received: 20 August 2017; Accepted: 26 March 2018; Published: 02 July 2018; Issue date: December 2018
Copyright

© The author(s) 2018

Acknowledgements

This work was supported by the National Key R&D Program of China (No. 2017YFB1003000); the National Natural Science Foundation of China (Nos. 61702096, 61572129, 61602112, 61502097, 61320106007, 61632008, and 61702097); the International S&T Cooperation Program of China (No. 2015DFA10490); the Natural Science Foundation of Jiangsu Province (Nos. BK20170689 and BK20160695); the Jiangsu Provincial Key Laboratory of Network and Information Security (No. BM2003201); the Key Laboratory of Computer Network and Information Integration of Ministry of Education of China (No. 93K-9); and the SGCC Science and Technology Program "the Distributed Data Management of Physical Distribution and Logical Integration". It was also partially supported by the Collaborative Innovation Center of Novel Software Technology and Industrialization and the Collaborative Innovation Center of Wireless Communications Technology.

Rights and permissions

Reprint and permission requests may be sought directly from the editorial office.
