
Energy Procurement and Retail Pricing for Electricity Retailers via Deep Reinforcement Learning with Long Short-term Memory

Authors: Hongsheng Xu, Jinyu Wen, Qinran Hu, Jiao Shu, Jixiang Lu, Zhihong Yang
State Key Laboratory of Advanced Electromagnetic Engineering and Technology, School of Electrical and Electronic Engineering, Huazhong University of Science and Technology, Wuhan 430074, China
NARI Group Corporation, Nanjing 211106, China
School of Electrical Engineering, Southeast University, Nanjing 210096, China
State Key Laboratory of Smart Grid Protection and Control, Nanjing 211106, China

Abstract

The joint optimization problem of energy procurement and retail pricing for an electricity retailer is converted into separately determining the optimal procurement strategy and the optimal pricing strategy, under the "price-taker" assumption. The aggregate energy consumption of end-use customers (EUCs) is predicted via a long short-term memory (LSTM)-based supervised learning method to solve for the optimal procurement strategy. The optimal retail pricing problem is formulated as a Markov decision process (MDP), which can be solved using deep reinforcement learning (DRL) algorithms. However, the performance of existing DRL approaches may deteriorate due to their insufficient ability to extract discriminative features from the time-series vectors in the environmental states. We propose a novel deep deterministic policy gradient (DDPG) network structure with a shared LSTM-based representation network that fully exploits the Actor's and Critic's losses. The designed shared representation network and the joint loss function enhance the environment perception capability of the proposed approach and further improve the optimization performance, resulting in a more profitable pricing strategy. Numerical simulations demonstrate the effectiveness of the proposed approach.

Keywords: Deep reinforcement learning, long short-term memory, electricity market, energy procurement, retail pricing
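
The abstract describes a DDPG structure in which the Actor and the Critic share a single LSTM-based representation network that is trained through a joint loss. The following PyTorch sketch illustrates one way such weight sharing could be wired up; the layer sizes, the one-dimensional sigmoid-scaled price action, and the 24-step, 3-feature state series are illustrative assumptions, not the configuration used in the paper.

import torch
import torch.nn as nn

class SharedLSTMEncoder(nn.Module):
    # Encodes the time-series part of the environmental state into a feature vector.
    def __init__(self, input_dim, hidden_dim=64):
        super().__init__()
        self.lstm = nn.LSTM(input_dim, hidden_dim, batch_first=True)

    def forward(self, x_seq):                  # x_seq: (batch, T, input_dim)
        _, (h_n, _) = self.lstm(x_seq)         # final hidden state summarizes the sequence
        return h_n[-1]                         # (batch, hidden_dim)

class Actor(nn.Module):
    # Maps the shared representation to a normalized retail price in (0, 1).
    def __init__(self, encoder, hidden_dim=64, action_dim=1):
        super().__init__()
        self.encoder = encoder
        self.head = nn.Sequential(nn.Linear(hidden_dim, 64), nn.ReLU(),
                                  nn.Linear(64, action_dim), nn.Sigmoid())

    def forward(self, x_seq):
        return self.head(self.encoder(x_seq))

class Critic(nn.Module):
    # Scores a (state, price) pair with an estimated Q-value.
    def __init__(self, encoder, hidden_dim=64, action_dim=1):
        super().__init__()
        self.encoder = encoder
        self.head = nn.Sequential(nn.Linear(hidden_dim + action_dim, 64), nn.ReLU(),
                                  nn.Linear(64, 1))

    def forward(self, x_seq, action):
        return self.head(torch.cat([self.encoder(x_seq), action], dim=-1))

# Both networks hold the same encoder instance, so the Critic's TD loss and the
# Actor's policy loss both backpropagate into the shared LSTM representation.
encoder = SharedLSTMEncoder(input_dim=3)
actor, critic = Actor(encoder), Critic(encoder)

state = torch.randn(8, 24, 3)   # 8 states, each a 24-step series of 3 features (assumed shape)
price = actor(state)            # proposed retail prices, shape (8, 1)
q_value = critic(state, price)  # corresponding Q-value estimates, shape (8, 1)

In an actual training loop, the Critic's temporal-difference loss and the Actor's loss would be combined, for example as a weighted sum, before the backward pass so that both objectives shape the shared encoder; the weighting is a design choice the sketch does not fix.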


Publication history

Received: 07 June 2021
Revised: 05 November 2021
Accepted: 28 December 2021
Published: 14 February 2022
Issue date: September 2022

Copyright

© 2021 CSEE

Acknowledgements

This work was supported in part by the Natural Science Foundation of Jiangsu Province (BK20210002) and the National Key R&D Program of China (2018AAA0101504).
