Open Access

Prompting Is Not Enough: Exploring Knowledge Integration and Controllable Generation on Large Language Models

School of Computer Science and Technology, University of Science and Technology of China, Hefei 230027, China
Computer Network Information Center, Chinese Academy of Sciences, Beijing 100083, China
BOSS Zhipin Career Science Lab, Beijing 100028, China

Abstract

Recently, with the rapid advancement of Large Language Models (LLMs), LLM-based Open-domain Question Answering (OpenQA) methods have benefited from the emergent understanding and answering capabilities enabled by massive parameters, compared to traditional methods. However, most of these methods face two critical challenges: how to integrate knowledge into LLMs effectively, and how to adaptively generate results in specific answer formats. To address these challenges, we propose GenKI, a novel framework that improves OpenQA performance by jointly exploring knowledge integration and controllable generation on LLMs. Specifically, we first train a dense passage retrieval model to retrieve associated knowledge from a given knowledge base. Subsequently, we introduce a novel knowledge integration model that incorporates the retrieved knowledge into instructions during fine-tuning to strengthen the model. Furthermore, to enable controllable generation in LLMs, we leverage a fine-tuned LLM together with an ensemble framework based on text consistency that accounts for coherence, fluency, and answer-format assurance. Finally, extensive experiments on three datasets with diverse answer formats demonstrate the effectiveness of GenKI in comparison with state-of-the-art baselines. Moreover, ablation studies reveal a linear relationship between the frequency of retrieved knowledge and the model's ability to accurately recall knowledge matching the ground truth. Tests on out-of-domain and knowledge-base-independence scenarios further confirm the robustness and controllability of GenKI. Our code is available at https://github.com/USTC-StarTeam/GenKI.
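The retrieve-then-integrate pipeline and the consistency-based ensemble described in the abstract can be sketched as follows. This is a minimal toy illustration, not the paper's implementation: the prompt template, the token-overlap consistency score, and all function names here are assumptions; GenKI itself uses a trained dense passage retriever, LLM fine-tuning, and a richer text-consistency measure.

```python
def build_instruction(question, passages):
    """Fold retrieved passages into an instruction for fine-tuning or
    inference (hypothetical template; the paper's format may differ)."""
    context = "\n".join(f"[{i + 1}] {p}" for i, p in enumerate(passages))
    return (
        "Answer the question using the passages below.\n"
        f"Passages:\n{context}\n"
        f"Question: {question}\nAnswer:"
    )

def consistency_ensemble(candidates):
    """Select the candidate answer most consistent with the others.
    Consistency is approximated by Jaccard token overlap here, standing
    in for the paper's text-consistency scoring."""
    def overlap(a, b):
        ta, tb = set(a.lower().split()), set(b.lower().split())
        return len(ta & tb) / max(1, len(ta | tb))
    # Score each candidate by its total overlap with every other candidate.
    scores = [
        sum(overlap(c, candidates[j]) for j in range(len(candidates)) if j != i)
        for i, c in enumerate(candidates)
    ]
    return candidates[scores.index(max(scores))]
```

For example, given candidate answers from several generations, the ensemble keeps the one that agrees most with the rest: `consistency_ensemble(["the capital is paris", "paris is the capital of france", "paris", "london"])` returns `"the capital is paris"`, since the outlier `"london"` shares no tokens with the others.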

Big Data Mining and Analytics
Pages 563-579


Cite this article:
Shen T, Wang H, Qin C, et al. Prompting Is Not Enough: Exploring Knowledge Integration and Controllable Generation on Large Language Models. Big Data Mining and Analytics, 2026, 9(2): 563-579. https://doi.org/10.26599/BDMA.2025.9020052


Received: 30 June 2024
Revised: 23 October 2024
Accepted: 30 April 2025
Published: 09 February 2026
© The author(s) 2026.

The articles published in this open access journal are distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/).