Deep reinforcement learning for online scheduling of photovoltaic systems with battery energy storage systems

Yaze Li1, Jingxian Wu1, and Yanjun Pan2
1 Department of Electrical Engineering, University of Arkansas, Fayetteville, AR 72701, USA
2 Department of Computer Science and Computer Engineering, University of Arkansas, Fayetteville, AR 72701, USA

Abstract

A new online scheduling algorithm is proposed for photovoltaic (PV) systems with battery energy storage systems (BESS). The stochastic nature of renewable energy sources necessitates the use of BESS to balance energy supply and demand under uncertain weather conditions. The proposed online scheduling algorithm aims to minimize the overall energy cost by performing actions such as load shifting and peak shaving through carefully scheduled BESS charging/discharging activities. The scheduling algorithm is developed using deep deterministic policy gradient (DDPG), a deep reinforcement learning (DRL) algorithm that can handle continuous state and action spaces. One of the main contributions of this work is a new DDPG reward function, designed around the unique behaviors of energy systems, which guides the scheduler to learn the appropriate load shifting and peak shaving behaviors through a balanced process of exploration and exploitation. The new scheduling algorithm is tested through case studies with real-world data, and the results indicate that it outperforms existing algorithms such as deep Q-learning. The online algorithm can efficiently learn the behaviors of optimal non-causal offline algorithms.
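To make the load-shifting and peak-shaving objectives concrete, the sketch below shows one plausible per-slot reward for a BESS scheduler: the negative of the slot's time-of-use energy cost plus a penalty on any increase in the billing-period demand peak. This is an illustrative assumption, not the reward function derived in the paper; all names and the simplified cost model are hypothetical.

    def step_reward(grid_power_kw, price_per_kwh, slot_hours,
                    running_peak_kw, peak_charge_per_kw):
        """Per-slot reward for a hypothetical BESS scheduler (illustrative only).

        grid_power_kw      : net power drawn from the grid (load - PV + battery charging)
        price_per_kwh      : time-of-use energy price for this slot
        slot_hours         : slot duration in hours
        running_peak_kw    : highest grid draw seen so far in the billing period
        peak_charge_per_kw : demand charge applied to the billing-period peak
        """
        # Energy cost of this slot; exported PV surplus is treated as worthless here.
        energy_cost = max(grid_power_kw, 0.0) * price_per_kwh * slot_hours
        # Penalize only the increase in the billing-period peak, so the agent
        # learns to discharge the battery during demand spikes (peak shaving).
        peak_increase = max(grid_power_kw - running_peak_kw, 0.0)
        return -(energy_cost + peak_increase * peak_charge_per_kw)

    # Example: a 40 kW draw in a 15-minute slot priced at $0.30/kWh,
    # below the existing 50 kW peak, incurs only energy cost: reward = -3.0.
    print(step_reward(40.0, 0.30, 0.25, 50.0, 18.0))

Because the battery charging/discharging rate and the system state (PV output, load, price, state of charge) are continuous quantities, an actor-critic method such as DDPG is a natural fit for maximizing the cumulative form of such a reward, whereas deep Q-learning would require discretizing the action space.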

Keywords: deep deterministic policy gradient (DDPG), photovoltaic (PV), battery energy storage system (BESS), Markov decision process (MDP)

Publication history

Received: 01 July 2023
Revised: 21 September 2023
Accepted: 31 October 2023
Published: 28 March 2024
Issue date: March 2024

Copyright

All articles included in the journal are copyrighted to the ITU and TUP.

Acknowledgements

The work was supported in part by the U.S. National Science Foundation (NSF) (No. ECCS-1711087) and the NSF Center for Infrastructure Trustworthiness in Energy Systems (CITES).

Rights and permissions

This work is available under the CC BY-NC-ND 3.0 IGO license: https://creativecommons.org/licenses/by-nc-nd/3.0/igo/
