Research Article

Optimizing the hyper-parameters of deep reinforcement learning for building control

Shuhao Li 1, Shu Su 1 (✉), Xiaorui Lin 2
1 Department of Civil Engineering, Southeast University, Nanjing, Jiangsu, China
2 OCT Eastern China Investment Co., Ltd., Shanghai, China

Abstract

Buildings are major energy consumers and carbon emitters, so improving building energy efficiency is essential to achieving sustainable development goals. Deep reinforcement learning (DRL), an advanced building control method, shows great potential for optimizing energy efficiency and improving occupant comfort. However, DRL performance is highly sensitive to hyper-parameters, and inappropriate choices may lead to unstable learning or outright failure. This study investigates the design and application of DRL in building energy system control, focusing on improving DRL controller performance through hyper-parameter optimization (HPO) algorithms, and provides quantitative evaluation and adaptive validation of the optimized controllers. Two widely used algorithms, deep deterministic policy gradient (DDPG) and soft actor-critic (SAC), are evaluated in different building environments based on the BOPTEST virtual testbed. A key focus is the comparison of HPO techniques, including the tree-structured Parzen estimator (TPE), the covariance matrix adaptation evolution strategy (CMA-ES), and combinatorial optimization methods, to determine their efficacy for DRL. The study enhances HPO efficiency through parallel computation and conducts a comprehensive quantitative assessment of the optimized DRL controllers, considering factors such as reduced energy consumption and improved comfort. The results show that the HPO algorithms significantly improve the performance of the DDPG and SAC controllers, reducing thermal discomfort by 56.94% and 68.74%, respectively. The study also demonstrates that the HPO-based approach enhances DRL controller performance across diverse building environments, providing valuable insights for the design and optimization of building DRL controllers.
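For illustration only, and not the authors' code: the sketch below shows the general shape of an HPO loop over DRL hyper-parameters like those the abstract describes. `evaluate_controller` is a hypothetical stand-in for a full DRL training-and-evaluation run in BOPTEST (in the paper this would return a cost combining energy use and thermal discomfort), and the search strategy here is plain random search, a simpler baseline than the TPE or CMA-ES methods the study compares.

```python
import math
import random

def evaluate_controller(lr, gamma, batch_size):
    """Hypothetical stand-in for training a DDPG/SAC agent and scoring it
    on energy consumption and thermal discomfort; lower is better.
    A real objective would run a full training episode in BOPTEST."""
    return ((math.log10(lr) + 3.0) ** 2            # best near lr = 1e-3
            + (gamma - 0.99) ** 2 * 100            # best near gamma = 0.99
            + abs(batch_size - 256) / 256)         # best near batch = 256

def random_search(n_trials=50, seed=0):
    """Sample hyper-parameter configurations and keep the best one."""
    rng = random.Random(seed)
    best_cfg, best_score = None, float("inf")
    for _ in range(n_trials):
        cfg = {
            "lr": 10 ** rng.uniform(-5, -2),             # log-uniform learning rate
            "gamma": rng.uniform(0.9, 0.999),            # discount factor
            "batch_size": rng.choice([64, 128, 256, 512]),
        }
        score = evaluate_controller(**cfg)
        if score < best_score:
            best_cfg, best_score = cfg, score
    return best_cfg, best_score
```

Model-based samplers such as TPE or CMA-ES replace the uniform sampling step with proposals informed by earlier trials, and the paper's parallel-computation speedup corresponds to running many `evaluate_controller` calls concurrently.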

Building Simulation
Pages 765-789

Cite this article:
Li S, Su S, Lin X. Optimizing the hyper-parameters of deep reinforcement learning for building control. Building Simulation, 2025, 18(4): 765-789. https://doi.org/10.1007/s12273-025-1233-y

Views: 902 · Crossref: 4 · Web of Science: 3 · Scopus: 2 · CSCD: 0

Received: 11 November 2024
Revised: 14 December 2024
Accepted: 22 December 2024
Published: 04 March 2025
© Tsinghua University Press 2025