Volume 14, Issue 1


Study on the application of reinforcement learning in the operation optimization of HVAC system

Xiaolei Yuan (1), Yiqun Pan (1), Jianrong Yang (2), Weitong Wang (3), Zhizhong Huang (4)
(1) School of Mechanical Engineering, Tongji University, 4800 Cao’an Road, Shanghai 201804, China
(2) Shanghai Research Institute of Building Sciences, Shanghai, China
(3) Kuaishou Co. Ltd., Beijing, China
(4) Sino-German College of Applied Sciences, Tongji University, Shanghai 201804, China

Abstract

Supervisory control can be used to optimize HVAC system operation and achieve building energy conservation, and reinforcement learning (RL) is considered a promising model-free supervisory control method. In this paper, we apply an RL algorithm to the operation optimization of an air-conditioning (AC) system and propose an RL-based model-free control strategy that combines rule-based and RL-based control algorithms, together with a complete application process. We use a variable air volume (VAV) air-conditioning system for a single-storey office building as a case study to validate the optimization performance of the RL-based controller. Control strategies using a rule-based controller (RBC) and a proportional-integral-derivative (PID) controller serve as the reference cases. The results show that, for single-zone air supply, the RL controller performs best in terms of both non-comfortable time and AC system energy cost after one year of exploration learning. The total energy consumption of the AC system is reduced by 7.7% and 4.7% compared with the RBC and PID strategies, respectively. For multi-zone air supply, the RL controller begins to outperform the reference strategies after a two-year exploration learning stage and a two-year buffer stage. From the seventh year on, the RL controller performs much better in terms of both non-comfortable time and AC system operating cost, with the operating cost reduced by 2.7% to 4.6% compared with the reference strategies. In addition, the RL controller is more suitable for small-scale operation optimization problems.

Keywords: reinforcement learning, energy saving, HVAC system, control strategy, operation optimization, VAV system
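
The abstract does not state which RL algorithm is used or how the rule-based and RL-based parts are combined. As a purely illustrative sketch, and not the authors' implementation, the idea of mixing the two can be shown with tabular Q-learning that falls back to a simple rule until a state has been visited often enough. Every name and number below (the action set, the 24 °C setpoint, the comfort band, the reward weights, the visit threshold) is a hypothetical choice made for this example.

```python
import random

# Hypothetical discrete actions: fraction of maximum supply air flow.
ACTIONS = [0.0, 0.5, 1.0]


def rule_based_action(temp, setpoint=24.0, band=1.0):
    """Hypothetical rule: full flow above the comfort band, none below, half inside."""
    if temp > setpoint + band:
        return 2
    if temp < setpoint - band:
        return 0
    return 1


def discretize(temp, low=20.0, high=30.0, bins=10):
    """Map a zone temperature (deg C) to a state index in [0, bins - 1]."""
    clipped = min(max(temp, low), high - 1e-9)
    return int((clipped - low) / (high - low) * bins)


def reward(temp, action, setpoint=24.0, band=1.0,
           energy_weight=1.0, comfort_penalty=5.0):
    """Assumed reward: negative energy use minus a penalty when outside the band."""
    energy = energy_weight * ACTIONS[action]
    discomfort = comfort_penalty if abs(temp - setpoint) > band else 0.0
    return -(energy + discomfort)


def q_update(q, s, a, r, s_next, alpha=0.1, gamma=0.9):
    """One-step Q-learning update on a table q[state][action]."""
    best_next = max(q[s_next])
    q[s][a] += alpha * (r + gamma * best_next - q[s][a])


def choose_action(q, s, temp, visits, min_visits=5, epsilon=0.1, rng=random):
    """Use the rule in rarely visited states, epsilon-greedy on q otherwise."""
    if visits[s] < min_visits:
        return rule_based_action(temp)
    if rng.random() < epsilon:
        return rng.randrange(len(ACTIONS))
    return max(range(len(ACTIONS)), key=lambda a: q[s][a])
```

In a control loop, each step would call `discretize` on the measured zone temperature, pick an action with `choose_action`, apply it, then call `q_update` with the value from `reward` and increment the visit counter for that state. The fallback keeps the controller's behaviour reasonable while the table is still mostly empty, which is one plausible reason to pair a rule-based controller with an RL agent.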

Publication history

Received: 30 August 2019
Accepted: 17 December 2019
Published: 23 March 2020
Issue date: February 2021

Copyright

© Tsinghua University Press and Springer-Verlag GmbH Germany, part of Springer Nature 2020

Acknowledgements

This study is supported by the Thirteenth Five-Year National Key Research and Development Program "Study on the Technical Standard System for Post-evaluation of Green Building Performance", Ministry of Science and Technology of China (No. 2016YFC0700105).
