Semi-Supervised Prefix Tuning of Large Language Models for Industrial Fault Diagnosis with Big Data

Gecheng Chen; Jiahao Yuan; Jiayu Yao; Zheng Luo; Jianqiang Li; Chengwen Luo

doi:10.26599/BDMA.2025.9020038

Open Access

Gecheng Chen^¹, Jiahao Yuan^¹, Jiayu Yao^¹, Zheng Luo^², Jianqiang Li^¹, Chengwen Luo^¹

^¹ School of Artificial Intelligence and National Engineering Laboratory for Big Data System Computing Technology, Shenzhen University, Shenzhen 518060, China

^² Faculty of Engineering, The University of Hong Kong, Hong Kong 999077, China

Abstract

Industrial fault diagnosis is crucial for ensuring the safety and efficiency of modern production systems. Industrial big data, particularly large-scale tabular data capturing multivariate time-series processes, offer valuable operational insights. However, existing methods struggle with extreme label scarcity and massive volumes of unlabeled data. Large Language Models (LLMs) hold great potential to address these issues owing to their strong capabilities in heterogeneous and few-shot learning. Nevertheless, the application of LLMs to fault diagnosis with industrial big data, especially tabular data, remains unexplored. We therefore propose a novel semi-supervised prefix-tuning method that adapts LLMs to fault diagnosis with industrial big data. We first generate auxiliary prediction tasks from the unlabeled data to serve as semi-supervised training material for the LLMs. We then design a prefix-based soft embedding layer to fine-tune the LLMs, so that the model learns task-specific information in a parameter-efficient way. To make the model applicable to industrial big data, we also employ Sparse Gaussian Processes (SGP) to select the most informative samples and thereby reduce the computational cost. Finally, we design a hybrid prompt template that combines hard and soft prompts to formulate the final prediction prompt for industrial diagnosis tasks. Experiments demonstrate the superiority of the proposed method.
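To make two of the steps in the abstract more concrete, the following is a minimal illustrative sketch, not the authors' implementation. It assumes that an auxiliary task can be formed by hiding one column of an unlabeled tabular row and asking for its value, and that a hybrid prompt is a run of soft-prompt placeholders followed by a textual (hard) prompt. The function names, the `<soft_i>` placeholder format, the prompt wording, and the sensor and label names are all hypothetical; the abstract does not specify any of them.

```python
from typing import Dict, List, Tuple

# Hypothetical placeholder for one soft-prompt position. In an actual
# prefix-tuning setup each position would map to a trainable embedding
# vector rather than to a literal string.
SOFT_TOKEN = "<soft_{}>"


def serialize_row(row: Dict[str, float]) -> str:
    """Turn one tabular row into plain text, e.g. 'temperature is 71.5'."""
    return ", ".join(f"{name} is {value:g}" for name, value in row.items())


def make_auxiliary_task(row: Dict[str, float], target_column: str) -> Tuple[str, str]:
    """Build a (prompt, answer) pair from an unlabeled row.

    Assumption: the auxiliary task hides one column and asks for its value,
    so no fault label is needed to create a training example.
    """
    visible = {k: v for k, v in row.items() if k != target_column}
    prompt = (
        f"Sensor readings: {serialize_row(visible)}. "
        f"What is the value of {target_column}?"
    )
    answer = f"{row[target_column]:g}"
    return prompt, answer


def build_hybrid_prompt(row: Dict[str, float], n_soft_tokens: int, labels: List[str]) -> str:
    """Prepend soft-prompt placeholders to a hard (textual) diagnosis prompt."""
    soft = " ".join(SOFT_TOKEN.format(i) for i in range(n_soft_tokens))
    hard = (
        f"Sensor readings: {serialize_row(row)}. "
        f"Which fault type is present? Options: {', '.join(labels)}."
    )
    return f"{soft} {hard}"


if __name__ == "__main__":
    example_row = {"temperature": 71.5, "pressure": 2.0}
    print(make_auxiliary_task(example_row, "pressure"))
    print(build_hybrid_prompt(example_row, 2, ["normal", "leak"]))
```

In this sketch the soft positions are only string placeholders. In prefix-style tuning generally, such positions are backed by a small set of trainable vectors while the LLM weights stay frozen, which is what makes the approach parameter-efficient.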

Keywords

Industrial big data; Large Language Models (LLMs); parameter-efficient fine-tuning; Sparse Gaussian Processes (SGP); prompt engineering

Big Data Mining and Analytics

Volume 8, Issue 6, December 2025

Pages 1353-1368

Cite this article:

Chen G, Yuan J, Yao J, et al. Semi-Supervised Prefix Tuning of Large Language Models for Industrial Fault Diagnosis with Big Data. Big Data Mining and Analytics, 2025, 8(6): 1353-1368. https://doi.org/10.26599/BDMA.2025.9020038

Received: 08 February 2025

Revised: 11 March 2025

Accepted: 02 April 2025

Published: 19 September 2025

The articles published in this open access journal are distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/).