[1]
Z. Chen, Y. Song, T. H. Chang, and X. Wan, Generating radiology reports via memory-driven transformer, arXiv preprint arXiv: 2010.16056, 2020.
[2]
X. Wang, Y. Peng, L. Lu, Z. Lu, and R. M. Summers, TieNet: Text-image embedding network for common thorax disease classification and reporting in chest X-rays, in Proc. 2018 IEEE/CVF Conf. Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 2018, pp. 9049–9058.
[3]
A. E. W. Johnson, T. J. Pollard, N. R. Greenbaum, M. P. Lungren, C. Y. Deng, Y. Peng, Z. Lu, R. G. Mark, S. J. Berkowitz, and S. Horng, MIMIC-CXR-JPG, a large publicly available database of labeled chest radiographs, arXiv preprint arXiv: 1901.07042, 2019.
[5]
P. Anderson, X. He, C. Buehler, D. Teney, M. Johnson, S. Gould, and L. Zhang, Bottom-up and top-down attention for image captioning and visual question answering, in Proc. 2018 IEEE/CVF Conf. Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 2018, pp. 6077–6086.
[6]
O. Vinyals, A. Toshev, S. Bengio, and D. Erhan, Show and tell: A neural image caption generator, in Proc. 2015 IEEE Conf. Computer Vision and Pattern Recognition (CVPR), Boston, MA, USA, 2015, pp. 3156–3164.
[8]
Y. Li, X. Liang, Z. Hu, and E. P. Xing, Hybrid retrieval-generation reinforced agent for medical image report generation, in Proc. 32nd Conf. Neural Information Processing Systems (NeurIPS 2018), Montréal, Canada, 2018, pp. 1–11.
[9]
M. Li, B. Lin, Z. Chen, H. Lin, X. Liang, and X. Chang, Dynamic graph enhanced contrastive learning for chest X-ray report generation, in Proc. 2023 IEEE/CVF Conf. Computer Vision and Pattern Recognition (CVPR), Vancouver, Canada, 2023, pp. 3334–3343.
[10]
Y. Zhang, X. Wang, Z. Xu, Q. Yu, A. Yuille, and D. Xu, When radiology report generation meets knowledge graph, in Proc. AAAI Conf. Artif. Intell., New York, NY, USA, 2020, pp. 12910–12917.
[11]
J. B. Yuan, H. F. Liao, R. Luo, and J. B. Luo, Automatic radiology report generation based on multi-view image fusion and medical concept enrichment, in Proc. Medical Image Computing and Computer Assisted Intervention—MICCAI 2019: 22nd Int. Conf., Shenzhen, China, 2019, pp. 721–729.
[12]
Y. Xue, T. Xu, L. R. Long, Z. Xue, S. Antani, G. R. Thoma, and X. Huang, Multimodal recurrent model with attention for automated radiology report generation, in Proc. Medical Image Computing and Computer Assisted Intervention—MICCAI 2018: 21st Int. Conf., Granada, Spain, 2018, pp. 457–466.
[13]
B. Jing, Z. Wang, and E. Xing, Show, describe, and conclude: On exploiting the structure information of chest X-ray reports, arXiv preprint arXiv: 2004.12274, 2020.
[14]
C. Y. Li, X. Liang, Z. Hu, and E. P. Xing, Knowledge-driven encode, retrieve, paraphrase for medical image report generation, in Proc. AAAI Conf. Artif. Intell., Honolulu, HI, USA, 2019, pp. 6666–6673.
[15]
H. C. Shin, K. Roberts, L. Lu, D. Demner-Fushman, J. Yao, and R. M. Summers, Learning to read chest X-rays: Recurrent neural cascade model for automated image annotation, in Proc. 2016 IEEE Conf. Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 2016, pp. 2497–2506.
[17]
F. Liu, X. Wu, S. Ge, W. Fan, and Y. Zou, Exploring and distilling posterior and prior knowledge for radiology report generation, arXiv preprint arXiv: 2106.06963, 2021.
[18]
Z. Chen, Y. Shen, Y. Song, and X. Wan, Cross-modal memory networks for radiology report generation, arXiv preprint arXiv: 2204.13258, 2022.
[19]
F. Liu, C. Yin, X. Wu, S. Ge, Y. Zou, P. Zhang, Y. Zou, and X. Sun, Contrastive attention for automatic chest X-ray report generation, arXiv preprint arXiv: 2106.06965, 2021.
[20]
X. Song, X. Zhang, J. Ji, Y. Liu, and P. Wei, Cross-modal contrastive attention model for medical report generation, in Proc. 29th Int. Conf. Computational Linguistics, Gyeongju, Republic of Korea, 2022, pp. 2388–2397.
[21]
Y. J. Chen, W. H. Shen, H. W. Chung, C. H. Chiu, D. C. Juan, T. Y. Ho, C. T. Cheng, M. L. Li, and T. Y. Ho, Representative image feature extraction via contrastive learning pretraining for chest X-ray report generation, arXiv preprint arXiv: 2209.01604, 2022.
[22]
M. Endo, R. Krishnan, V. Krishna, A. Ng, and P. Rajpurkar, Retrieval-based chest X-ray report generation using a pre-trained contrastive language-image model, in Proc. Machine Learning for Health, Virtual Event, 2021, pp. 209–219.
[23]
A. Yan, Z. He, X. Lu, J. Du, E. Chang, A. Gentili, J. McAuley, and C. N. Hsu, Weakly supervised contrastive learning for chest X-ray report generation, arXiv preprint arXiv: 2109.12242, 2021.
[24]
K. He, X. Zhang, S. Ren, and J. Sun, Deep residual learning for image recognition, in Proc. 2016 IEEE Conf. Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 2016, pp. 770–778.
[25]
A. Graves, Supervised sequence labelling with recurrent neural networks, PhD dissertation, Technische Universität München, Munich, Germany, 2008.
[26]
A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, L. Kaiser, and I. Polosukhin, Attention is all you need, arXiv preprint arXiv: 1706.03762, 2017.
[28]
S. Jain, A. Agrawal, A. Saporta, S. Q. Truong, D. N. Duong, T. Bui, P. Chambon, Y. Zhang, M. P. Lungren, A. Y. Ng, et al., RadGraph: Extracting clinical entities and relations from radiology reports, arXiv preprint arXiv: 2106.14463, 2021.
[29]
I. Beltagy, K. Lo, and A. Cohan, SciBERT: A pretrained language model for scientific text, arXiv preprint arXiv: 1903.10676, 2019.
[30]
J. Li, D. Li, C. Xiong, and S. Hoi, BLIP: Bootstrapping language-image pre-training for unified vision-language understanding and generation, arXiv preprint arXiv: 2201.12086, 2022.
[31]
R. Vedantam, C. L. Zitnick, and D. Parikh, CIDEr: Consensus-based image description evaluation, in Proc. 2015 IEEE Conf. Computer Vision and Pattern Recognition (CVPR), Boston, MA, USA, 2015, pp. 4566–4575.
[32]
K. Papineni, S. Roukos, T. Ward, and W. J. Zhu, BLEU: A method for automatic evaluation of machine translation, in Proc. 40th Annual Meeting of the Association for Computational Linguistics, Philadelphia, PA, USA, 2002, pp. 311–318.
[33]
C. Y. Lin, ROUGE: A package for automatic evaluation of summaries, in Proc. ACL Workshop Text Summarization Branches Out, Barcelona, Spain, 2004, pp. 74–81.