Review | Open Access

A Survey on Accelerated Technologies for Mixture-of-Experts Model Training Systems

Department of Computer Science and Technology, Tsinghua University, Beijing 100084, China
Department of Computer Science and Technology, Tsinghua University, Beijing 100084, China, and also with School of Computer Science and Engineering, Qinghai University, Xining 810000, China

Abstract

Mixture-of-Experts (MoE) models have emerged as a transformative paradigm for scaling Large Language Models (LLMs), enabling unprecedented model capacity while maintaining computational efficiency through sparse activation mechanisms. However, the unique architectural characteristics of MoE models introduce significant system-level challenges that differ fundamentally from those of traditional dense models and necessitate specialized system optimizations tailored to MoE's distinctive properties. This survey systematically analyzes accelerated technologies for MoE training systems, discussing recent advances across four critical optimization dimensions: hybrid parallel computing, comprehensive memory management, fine-grained communication scheduling, and adaptive load balancing. Our analysis reveals a paradigm shift from computation-centric to workload-centric optimization strategies. Moreover, we identify emerging research directions, including machine learning-guided load balancing, cross-layer optimization frameworks, and hardware-software co-design for MoE training workloads. This work aims to provide researchers and system engineers with a comprehensive technical reference for designing more efficient and scalable next-generation MoE training systems.
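
To make the sparse activation mechanism mentioned in the abstract concrete, below is a minimal PyTorch sketch of top-k expert gating. It is illustrative only: the function topk_gate and its parameter names are hypothetical, not an implementation from any system covered by the survey.

```python
# Minimal sketch of top-k sparse gating (illustrative; not from any surveyed system).
import torch
import torch.nn.functional as F

def topk_gate(x: torch.Tensor, gate_weight: torch.Tensor, k: int = 2):
    """Route each token to its k highest-scoring experts.

    x:           (num_tokens, d_model) token representations
    gate_weight: (d_model, num_experts) learned router matrix
    Returns the chosen expert indices and their normalized routing
    weights; only k of num_experts experts run per token, which is
    why MoE capacity can grow without a matching growth in FLOPs.
    """
    logits = x @ gate_weight                      # (num_tokens, num_experts)
    topk_vals, topk_idx = logits.topk(k, dim=-1)  # keep the k best experts per token
    weights = F.softmax(topk_vals, dim=-1)        # renormalize over the chosen experts
    return topk_idx, weights
```

In a distributed setting, topk_idx determines which tokens are dispatched to which devices, which is exactly where the parallelism, communication-scheduling, and load-balancing challenges surveyed in this paper arise.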


Cite this article:
Zhang Q, Zhai J, Zheng W. A Survey on Accelerated Technologies for Mixture-of-Experts Model Training Systems. Tsinghua Science and Technology, 2026, 31(3): 1411-1439. https://doi.org/10.26599/TST.2025.9010169

Received: 27 July 2025
Revised: 7 September 2025
Accepted: 13 October 2025
Published: 19 December 2025
© The author(s) 2026.

The articles published in this open access journal are distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/).