Open Access

DPN: Dynamics Priori Networks for Radiology Report Generation

Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China, and also with University of Chinese Academy of Sciences, Beijing 100049, China
School of Artificial Intelligence, Shenzhen Polytechnic University, Shenzhen 518055, China

Abstract

Radiology report generation, the automatic production of diagnostic text from medical images, is of significant clinical importance. Unlike standard image captioning, it suffers from more pronounced visual and textual biases because data availability is constrained, which makes prior knowledge especially valuable. In this paper, we introduce Dynamics Priori Networks (DPN), a radiology report generation network that leverages a dynamic knowledge graph together with prior knowledge. Concretely, we build an adaptable graph network and exploit both medical domain knowledge and expert insights to strengthen the model. We further introduce an image-text contrastive module and an image-text matching module to improve the quality of the generated reports. Our method is evaluated on two widely used datasets, the X-ray collection from Indiana University (IU X-ray) and the Medical Information Mart for Intensive Care, Chest X-Ray (MIMIC-CXR), where it achieves superior performance, particularly on critical metrics.
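
The abstract names an image-text contrastive module and an image-text matching module but does not reproduce their definitions here. Such objectives are commonly implemented as BLIP-style losses: a symmetric InfoNCE loss over projected image and report embeddings, plus a binary matched/unmatched classifier. The following is a minimal PyTorch sketch under that assumption; the class name ITCITMHead, the feature dimensions, and the projection layers are illustrative placeholders, not the authors' implementation.

# Hypothetical sketch of BLIP-style image-text contrastive (ITC) and
# image-text matching (ITM) objectives. All names and dimensions are
# illustrative assumptions, not the DPN authors' released code.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ITCITMHead(nn.Module):
    def __init__(self, img_dim=2048, txt_dim=768, embed_dim=256):
        super().__init__()
        self.img_proj = nn.Linear(img_dim, embed_dim)    # project image features
        self.txt_proj = nn.Linear(txt_dim, embed_dim)    # project report features
        self.temp = nn.Parameter(torch.tensor(0.07))     # learnable temperature
        self.itm_head = nn.Linear(img_dim + txt_dim, 2)  # matched vs. not matched

    def contrastive_loss(self, img_feat, txt_feat):
        # Symmetric InfoNCE over a batch of (image, report) pairs:
        # the i-th image should be closest to the i-th report, and vice versa.
        v = F.normalize(self.img_proj(img_feat), dim=-1)
        t = F.normalize(self.txt_proj(txt_feat), dim=-1)
        logits = v @ t.t() / self.temp                   # (B, B) similarity matrix
        labels = torch.arange(v.size(0), device=v.device)
        return (F.cross_entropy(logits, labels) +
                F.cross_entropy(logits.t(), labels)) / 2

    def matching_loss(self, img_feat, txt_feat, match_labels):
        # Binary classification: does this report describe this image?
        logits = self.itm_head(torch.cat([img_feat, txt_feat], dim=-1))
        return F.cross_entropy(logits, match_labels)

# Toy usage with a batch of 4 image/report feature pairs; a real pipeline
# would also mine hard negative pairs for the matching loss.
head = ITCITMHead()
img = torch.randn(4, 2048)                               # e.g., CNN pooled features
txt = torch.randn(4, 768)                                # e.g., transformer [CLS] features
loss = head.contrastive_loss(img, txt) + head.matching_loss(
    img, txt, torch.ones(4, dtype=torch.long))           # all pairs matched here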

Tsinghua Science and Technology, Pages 600–609
Cite this article:
Yang B, Lei H, Huang H, et al. DPN: Dynamics Priori Networks for Radiology Report Generation. Tsinghua Science and Technology, 2025, 30(2): 600-609. https://doi.org/10.26599/TST.2023.9010134


Received: 08 September 2023
Revised: 07 October 2023
Accepted: 31 October 2023
Published: 09 December 2024
© The Author(s) 2025.

The articles published in this open access journal are distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/).
