Volume 15, Issue 2

A novel deep generative modeling-based data augmentation strategy for improving short-term building energy predictions

Cheng Fan1,2, Meiling Chen1,2, Rui Tang3, Jiayuan Wang1,2
1 Key Laboratory for Resilient Infrastructures of Coastal Cities (Shenzhen University), Ministry of Education, China
2 Sino-Australia Joint Research Center in BIM and Smart Construction, Shenzhen University, Shenzhen, China
3 Building Technology & Urban Systems Division, Lawrence Berkeley National Laboratory, USA

Abstract

Short-term building energy prediction is one of the fundamental tasks in building operation management. While many studies have explored the value of various supervised machine learning techniques for energy prediction, few have addressed the potential data shortage problem in developing data-driven models. One promising solution is data augmentation, which aims to enrich existing building data resources for reliable predictive modeling. This study proposes a deep generative modeling-based data augmentation strategy for improving short-term building energy predictions. Two types of conditional variational autoencoders have been designed for synthetic energy data generation, using fully connected and one-dimensional convolutional layers, respectively. Data experiments have been designed to evaluate the value of data augmentation using actual measurements from 52 buildings. The results indicate that conditional variational autoencoders are capable of generating high-quality synthetic data samples, which in turn helps to enhance the accuracy of short-term building energy predictions. The average performance enhancement ratios in terms of CV-RMSE range between 12% and 18%. Practical guidelines have been obtained to ensure the validity and quality of synthetic building energy data. The research outcomes are valuable for enhancing the robustness and reliability of data-driven models for smart building operation management.

Keywords: data-driven models, building energy predictions, data augmentation, generative modeling, variational autoencoders
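
The abstract reports accuracy gains as performance enhancement ratios in terms of CV-RMSE. As a minimal sketch of how such figures could be computed, the Python snippet below uses the common definition of CV-RMSE (root mean squared error divided by the mean of the measured values, in percent). The enhancement ratio is assumed here to be the relative reduction in CV-RMSE against a baseline model trained without augmentation; the abstract does not state the exact formula, so that definition, the function names, and the toy numbers are all illustrative assumptions rather than the authors' implementation.

```python
import math


def cv_rmse(actual, predicted):
    """CV-RMSE in percent: RMSE normalized by the mean of the measured values."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("actual and predicted must be non-empty and equal in length")
    mse = sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    mean_actual = sum(actual) / len(actual)
    if mean_actual == 0:
        raise ValueError("mean of measured values is zero; CV-RMSE is undefined")
    return 100.0 * math.sqrt(mse) / mean_actual


def enhancement_ratio(baseline_cv_rmse, augmented_cv_rmse):
    """Assumed definition: relative CV-RMSE reduction versus the baseline, in percent."""
    return 100.0 * (baseline_cv_rmse - augmented_cv_rmse) / baseline_cv_rmse


if __name__ == "__main__":
    # Toy hourly energy values, made up purely for illustration.
    measured = [100.0, 120.0, 80.0, 100.0]
    baseline_pred = [90.0, 130.0, 90.0, 90.0]    # model trained on original data only
    augmented_pred = [95.0, 125.0, 85.0, 95.0]   # model trained with synthetic data added

    base = cv_rmse(measured, baseline_pred)
    aug = cv_rmse(measured, augmented_pred)
    print(f"baseline CV-RMSE:  {base:.2f}%")
    print(f"augmented CV-RMSE: {aug:.2f}%")
    print(f"enhancement ratio: {enhancement_ratio(base, aug):.2f}%")
```

Normalizing by the mean of the measured values makes the error comparable across buildings with very different consumption levels, which matters when results are averaged over many buildings.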

Publication history

Received: 06 January 2021
Revised: 22 March 2021
Accepted: 08 April 2021
Published: 14 July 2021
Issue date: February 2022

Copyright

© Tsinghua University Press and Springer-Verlag GmbH Germany, part of Springer Nature 2021

Acknowledgements

The authors gratefully acknowledge the support of this research by the National Natural Science Foundation of China (No. 51908365, No. 71772125) and the Philosophical and Social Science Program of Guangdong Province, China (GD18YGL07).