Volume 27, Issue 2




Research Advances on AI-Powered Thermal Management for Data Centers

Hui Liu, AbdusSalam Aljbri, Jie Song, Jinqing Jiang, Chun Hua
School of Metallurgy, Northeastern University, Shenyang 110819, China
Software College, Northeastern University, Shenyang 110169, China
College of Computer Science and Technology, Inner Mongolia University for Nationalities, Tongliao 010021, China

Abstract

Data center thermal management is complex and costly in resources, processing time, and energy, which makes thermal awareness and thermal management powered by artificial intelligence (AI) the focus of this study. In addition to a small body of research on AI techniques and models, other strategies have also been introduced in recent years. Data center models, including cooling, thermal, power, and workload models, and the relationships among them must be understood to build an optimal thermal management system. Simulation approaches have been proposed to help validate new models or methods for scheduling and consolidating processes and virtual machines (VMs), identifying hotspots, estimating thermal state, and tracking changes in power usage. AI-powered thermal optimization improves process scheduling and VM consolidation and prevents hotspots from forming. At present, research on AI-powered thermal control is still in its infancy. This paper concludes with four open issues in thermal management, which define the scope of further research.

Keywords: artificial intelligence, data center, thermal management, heuristic, metaheuristic


Publication history

Received: 31 December 2020
Revised: 08 February 2021
Accepted: 28 February 2021
Published: 29 September 2021
Issue date: April 2022

Copyright

© The author(s) 2022

Acknowledgements

This paper was supported by the National Natural Science Foundation of China (Nos. 61662057 and 61672143).

Rights and permissions

The articles published in this open access journal are distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/).
