Research Article

Optimizing the hyper-parameters of deep reinforcement learning for building control

Shuhao Li 1, Shu Su 1 (✉), Xiaorui Lin 2
1 Department of Civil Engineering, Southeast University, Nanjing, Jiangsu, China
2 OCT Eastern China Investment Co., Ltd., Shanghai, China

Abstract

Buildings are major energy consumers and carbon emitters, so improving building energy efficiency is essential to achieving sustainable development goals. Deep reinforcement learning (DRL), an advanced building control method, shows great potential for optimizing energy efficiency and improving occupant comfort. However, DRL performance is highly sensitive to hyper-parameters, and inappropriate choices may lead to unstable learning or outright failure. This study investigates the design and application of DRL in building energy system control, focusing on improving DRL controller performance through hyper-parameter optimization (HPO) algorithms, and provides quantitative evaluation and adaptive validation of the optimized controllers. Two widely used algorithms, deep deterministic policy gradient (DDPG) and soft actor-critic (SAC), are evaluated in different building environments based on the BOPTEST virtual testbed. A key focus is the comparison of HPO techniques, including the tree-structured Parzen estimator (TPE), the covariance matrix adaptation evolution strategy (CMA-ES), and combinatorial optimization methods, to determine their efficacy for DRL. The study enhances HPO efficiency through parallel computation and conducts a comprehensive quantitative assessment of the optimized DRL controllers, considering factors such as reduced energy consumption and improved comfort. The results show that the HPO algorithms significantly improve the performance of the DDPG and SAC controllers, reducing thermal discomfort by 56.94% and 68.74%, respectively. The study also demonstrates that the HPO-based approach enhances DRL controller performance across diverse building environments, providing valuable insights for the design and optimization of building DRL controllers.
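For illustration only, and not the authors' code: the sketch below shows the general shape of an HPO loop over DRL hyper-parameters like those the abstract describes. `evaluate_controller` is a hypothetical stand-in for a full DRL training-and-evaluation run in BOPTEST (in the paper this would return a cost combining energy use and thermal discomfort), and the search strategy here is plain random search, a simpler baseline than the TPE or CMA-ES methods the study compares.

```python
import math
import random

def evaluate_controller(lr, gamma, batch_size):
    """Hypothetical stand-in for training a DDPG/SAC agent and scoring it
    on energy consumption and thermal discomfort; lower is better.
    A real objective would run a full training episode in BOPTEST."""
    return ((math.log10(lr) + 3.0) ** 2            # best near lr = 1e-3
            + (gamma - 0.99) ** 2 * 100            # best near gamma = 0.99
            + abs(batch_size - 256) / 256)         # best near batch = 256

def random_search(n_trials=50, seed=0):
    """Sample hyper-parameter configurations and keep the best one."""
    rng = random.Random(seed)
    best_cfg, best_score = None, float("inf")
    for _ in range(n_trials):
        cfg = {
            "lr": 10 ** rng.uniform(-5, -2),             # log-uniform learning rate
            "gamma": rng.uniform(0.9, 0.999),            # discount factor
            "batch_size": rng.choice([64, 128, 256, 512]),
        }
        score = evaluate_controller(**cfg)
        if score < best_score:
            best_cfg, best_score = cfg, score
    return best_cfg, best_score
```

Model-based samplers such as TPE or CMA-ES replace the uniform sampling step with proposals informed by earlier trials, and the paper's parallel-computation speedup corresponds to running many `evaluate_controller` calls concurrently.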

Building Simulation
Pages 765-789

Cite this article:
Li S, Su S, Lin X. Optimizing the hyper-parameters of deep reinforcement learning for building control. Building Simulation, 2025, 18(4): 765-789. https://doi.org/10.1007/s12273-025-1233-y

Views: 902 · Crossref: 4 · Web of Science: 3 · Scopus: 2 · CSCD: 0

Received: 11 November 2024
Revised: 14 December 2024
Accepted: 22 December 2024
Published: 04 March 2025
© Tsinghua University Press 2025