Open Access

Leveraging Large Language Models to Enhance Medical Text Representation for Lung Diagnosis Prediction via Knowledge Infusion

School of Biomedical Engineering, Capital Medical University, Beijing 100069, China
Institute of Precision Medicine, Peking University Shenzhen Hospital, Shenzhen 518036, China
Department of Geriatrics, The Second Medical Center and National Clinical Research Center for Geriatric Diseases, Chinese People’s Liberation Army General Hospital, Beijing 100853, China

Abstract

Medical text representation is crucial for medical natural language processing (NLP) applications. Bidirectional encoder representations from transformers (BERT) has achieved state-of-the-art performance in general-domain text representation. However, limited by the design of its pretraining task and by how rarely medical knowledge occurs in its training data, BERT lacks an understanding of medical knowledge. To overcome these problems, we proposed a selective knowledge extraction and fusion framework to enhance medical text representation. In the knowledge extraction phase, we first designed a semantic importance evaluation metric to extract internal knowledge. We then used large language models (LLMs) to extract external knowledge from the Systematized Nomenclature of Medicine Clinical Terms (SNOMED CT). In the knowledge fusion phase, we used an attention mechanism and a Siamese network to integrate the internal and external knowledge. By extracting knowledge through LLMs and integrating it into five different types of BERT models, we achieved significant improvements in pulmonary disease text classification.
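
The fusion step named in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the abstract gives no architectural details, so the shared weight matrix `SHARED_W`, the three-dimensional input vectors, and the function names `encode` and `attention_fuse` are all hypothetical. The sketch only shows the two generic ideas involved: a Siamese encoder (the same weights applied to every input) and scaled dot-product attention that weights each knowledge vector by its similarity to the text vector.

```python
import math

def encode(vec, weights):
    """Siamese-style encoder: the SAME weight matrix is applied to every input."""
    return [math.tanh(sum(w * x for w, x in zip(row, vec))) for row in weights]

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_fuse(query, candidates):
    """Scaled dot-product attention: weight each candidate by similarity to the query."""
    scale = math.sqrt(len(query))
    scores = [sum(q * c for q, c in zip(query, cand)) / scale for cand in candidates]
    weights = softmax(scores)
    fused = [
        sum(w * cand[i] for w, cand in zip(weights, candidates))
        for i in range(len(query))
    ]
    return fused, weights

# Hypothetical toy values; a real system would use BERT embeddings instead.
SHARED_W = [[0.5, -0.2, 0.1], [0.3, 0.8, -0.5]]
text_vec = [1.0, 0.0, 0.5]        # stands in for the clinical text representation
internal_vec = [0.9, 0.1, 0.4]    # stands in for knowledge extracted from the text itself
external_vec = [-0.3, 0.7, 0.2]   # stands in for knowledge drawn from SNOMED CT

query = encode(text_vec, SHARED_W)
knowledge = [encode(internal_vec, SHARED_W), encode(external_vec, SHARED_W)]
fused, weights = attention_fuse(query, knowledge)
```

Here `weights` sums to 1 and indicates how much each knowledge source contributes to `fused`; in a trained model the encoder weights would be learned rather than fixed by hand.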

Tsinghua Science and Technology
Pages 418-429

Cite this article:
Gao B, Dong Q, Tao T, et al. Leveraging Large Language Models to Enhance Medical Text Representation for Lung Diagnosis Prediction via Knowledge Infusion. Tsinghua Science and Technology, 2026, 31(1): 418-429. https://doi.org/10.26599/TST.2024.9010153

Views: 2387
Downloads: 641
Citations: Crossref 1, Web of Science 1, Scopus 0, CSCD 0

Received: 12 July 2024
Accepted: 19 August 2024
Published: 25 August 2025
© The author(s) 2026.

The articles published in this open access journal are distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/).