Open Access

Brief Introduction of TianHe Exascale Prototype System

Ruibo Wang, Kai Lu, Juan Chen, Wenzhe Zhang, Jinwen Li, Yuan Yuan, Pingjing Lu, Libo Huang, Shengguo Li, Xiaokang Fan
College of Computer, National University of Defense Technology, Changsha 410072, China.

Abstract

To address the challenges of next-generation exascale computing, the National University of Defense Technology has developed a prototype system to explore the opportunities, solutions, and limits on the path to the next-generation Tianhe system. This paper briefly introduces the prototype, which is deployed at the National Supercomputer Center in Tianjin and has a theoretical peak performance of 3.15 Pflops. The system comprises 512 compute nodes, each with three proprietary Matrix-2000+ CPUs. Total system memory is 98.3 TB, and total storage is 1.4 PB.
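
The headline figures in the abstract imply some per-CPU and per-node values that the abstract itself does not state. The short sketch below derives them, assuming decimal prefixes (1 Pflops = 10^15 flops, 1 TB = 10^12 bytes) and an even split of peak performance and memory across nodes. Both are assumptions made for illustration, and the constant and function names are hypothetical.

```python
# Figures stated in the abstract.
NODES = 512
CPUS_PER_NODE = 3
PEAK_FLOPS = 3.15e15    # 3.15 Pflops (assumption: decimal prefix)
MEMORY_BYTES = 98.3e12  # 98.3 TB (assumption: decimal prefix)


def derived_figures():
    """Derive per-CPU and per-node values, assuming an even split across nodes."""
    total_cpus = NODES * CPUS_PER_NODE
    return {
        "total_cpus": total_cpus,
        "peak_tflops_per_cpu": PEAK_FLOPS / total_cpus / 1e12,
        "peak_tflops_per_node": PEAK_FLOPS / NODES / 1e12,
        "memory_gb_per_node": MEMORY_BYTES / NODES / 1e9,
    }


if __name__ == "__main__":
    for name, value in derived_figures().items():
        print(f"{name}: {value:.2f}")
```

Under these assumptions the system has 1536 CPUs in total, with roughly 2.05 Tflops of theoretical peak per CPU and about 192 GB of memory per node. These are derived estimates, not values reported in the abstract.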

Tsinghua Science and Technology
Pages 361-369
Cite this article:
Wang R, Lu K, Chen J, et al. Brief Introduction of TianHe Exascale Prototype System. Tsinghua Science and Technology, 2021, 26(3): 361-369. https://doi.org/10.26599/TST.2020.9010009


Received: 18 March 2020
Accepted: 24 March 2020
Published: 12 October 2020
© The author(s) 2021.

The articles published in this open access journal are distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/).
