Research | Open Access

An empirical study of Qwen3 quantization

Xingyu Zheng¹, Yuye Li², Haoran Chu¹, Yue Feng³, Xudong Ma¹, Zining Wang¹,⁴, Jie Luo¹, Jinyang Guo³, Haotong Qin⁵, Michele Magno⁵, Xianglong Liu¹

¹ School of Computer Science and Engineering, Beihang University, Beijing, China

² School of Computer Science and Technology, Xidian University, Xi’an, China

³ School of Artificial Intelligence, Beihang University, Beijing, China

⁴ ByteDance, Beijing, China

⁵ Department of Information Technology and Electrical Engineering, ETH Zurich, Zürich, Switzerland


Abstract

The Qwen series has emerged as a leading family of open-source large language models (LLMs), demonstrating remarkable capabilities in natural language understanding tasks. With the recent release of Qwen3, which exhibits superior performance across diverse benchmarks, interest has grown in deploying these models efficiently in resource-constrained environments. Low-bit quantization presents a promising solution, yet its impact on Qwen3’s performance remains underexplored. This study conducts a systematic evaluation of Qwen3’s robustness under various quantization settings, aiming to identify both the opportunities and the challenges inherent in compressing this state-of-the-art model. We rigorously assess five classic post-training quantization techniques applied to Qwen3, spanning bit-widths from 1 to 8 bits, and evaluate their effectiveness across multiple datasets. Our findings reveal that while Qwen3 maintains competitive performance at moderate bit-widths, it experiences notable degradation in linguistic tasks under ultra-low precision, underscoring the persistent hurdles in LLM compression. These results emphasize the need for further research to mitigate performance loss in extreme quantization scenarios. We anticipate that this empirical analysis will provide actionable insights for advancing quantization methods tailored to Qwen3 and future LLMs, ultimately enhancing their practicality without compromising accuracy.
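To make the notion of bit-width concrete, the following is a minimal sketch of symmetric per-tensor round-to-nearest (RTN) quantization applied to synthetic Gaussian "weights". It is an illustration only: the abstract does not name the five techniques evaluated, so this should not be read as one of the paper's methods, and the function names `quantize_rtn` and `mse`, the weight distribution, and the tensor size are all assumptions chosen for the example. The sketch covers 2 to 8 bits; 1-bit binarization needs a different scheme and is left out.

```python
import random


def quantize_rtn(weights, bits):
    """Quantize to a signed `bits`-bit grid and map back to floats (hypothetical helper)."""
    if bits < 2:
        raise ValueError("this sketch handles 2 bits and above only")
    qmax = 2 ** (bits - 1) - 1            # largest integer level, e.g. 127 for 8 bits
    max_abs = max(abs(w) for w in weights)
    if max_abs == 0.0:
        return list(weights)              # nothing to scale
    scale = max_abs / qmax                # one scale for the whole tensor
    dequantized = []
    for w in weights:
        q = round(w / scale)              # round to the nearest integer level
        q = max(-qmax, min(qmax, q))      # clamp to the representable range
        dequantized.append(q * scale)
    return dequantized


def mse(a, b):
    """Mean squared error between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


# Synthetic stand-in for a weight tensor (assumption: zero-mean Gaussian).
rng = random.Random(0)
weights = [rng.gauss(0.0, 0.02) for _ in range(4096)]

errors = {bits: mse(weights, quantize_rtn(weights, bits)) for bits in (8, 4, 3, 2)}
for bits, err in errors.items():
    print(f"{bits}-bit RTN: MSE = {err:.3e}")
```

In this toy setting the reconstruction error grows as the bit-width shrinks, which is the basic effect behind the accuracy degradation at ultra-low precision that the abstract reports; real post-training quantization methods add calibration data, per-group scales, or error compensation to reduce it.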

Keywords

Quantization; Large language models; Model compression; Deep learning


Visual Intelligence

Volume 4, 2026

Article number: 11

DOI: 10.1007/s44267-026-00114-4


Cite this article:

Zheng X, Li Y, Chu H, et al. An empirical study of Qwen3 quantization. Visual Intelligence, 2026, 4: 11. https://doi.org/10.1007/s44267-026-00114-4

Received: 19 November 2025

Revised: 19 March 2026

Accepted: 24 March 2026

Published: 16 April 2026

This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.