Regular Paper

High Performance MPI over the Slingshot Interconnect

Department of Computer Science and Engineering, The Ohio State University, Columbus, OH 43210, U.S.A.

Abstract

The Slingshot interconnect designed by HPE/Cray is becoming more relevant in high-performance computing with its deployment on upcoming exascale systems. In particular, it is the interconnect powering Frontier, the first exascale system and the highest-ranked supercomputer in the world. It offers features such as adaptive routing, congestion control, and workload isolation. The deployment of a new interconnect raises questions about performance, scalability, and potential bottlenecks, since the interconnect is a critical element in how well these systems scale across nodes. In this paper, we delve into the challenges the Slingshot interconnect poses for current state-of-the-art MPI (Message Passing Interface) libraries. In particular, we look at scalability when using Slingshot across nodes. We present a comprehensive evaluation of various MPI and communication libraries, including Cray MPICH, OpenMPI + UCX, RCCL, and MVAPICH2, on CPUs and GPUs on the Spock system, an early-access cluster deployed with Slingshot-10, AMD MI100 GPUs, and AMD EPYC Rome CPUs to emulate the Frontier system. We also evaluate preliminary CPU-based support of MPI libraries on the Slingshot-11 interconnect.

Electronic Supplementary Material

JCST-2210-12907-Highlights.pdf (586.1 KB)

Journal of Computer Science and Technology
Pages 128-145

Cite this article:
Khorassani KS, Chen C-C, Ramesh B, et al. High Performance MPI over the Slingshot Interconnect. Journal of Computer Science and Technology, 2023, 38(1): 128-145. https://doi.org/10.1007/s11390-023-2907-5

Views: 2295
Crossref: 10
Web of Science: 7
Scopus: 9
CSCD: 0

Received: 16 October 2022
Revised: 29 October 2022
Accepted: 05 January 2023
Published: 28 February 2023
© Institute of Computing Technology, Chinese Academy of Sciences 2023