


# Classification of Medical Image Notes for Image Labeling by Using MinBERT

Bokai Yang (1, 3), Yujie Yang (1, 3), Qi Li (1, 3), Ye Li (1, 3), Jing Zheng (2), Yunpeng Cai (1, 3)

1. Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China
2. Shenzhen Health Development Research and Data Management Center, Shenzhen 518055, China
3. University of Chinese Academy of Sciences, Beijing 100049, China

## Abstract

The lack of labeled image data poses a serious challenge to the application of artificial intelligence (AI) in medical image diagnosis. Medical image notes contain valuable patient information that could be used to label images for machine learning tasks. However, most image note texts are unstructured, heterogeneous, and short, which defeats traditional keyword-based techniques. We used a deep learning approach that combines deep word embedding with deep neural network classifiers to automatically recover missing labels for medical image notes. We propose MinBERT, bidirectional encoder representations from transformers trained on a corpus of medical image notes. We applied the proposed techniques to two typical classification tasks: medical image type identification and clinical diagnosis identification. The two methods significantly outperformed baseline methods, achieving accuracies of 99.56% and 99.72% in image type identification and 94.56% and 92.45% in clinical diagnosis identification. Visualization analysis further indicated that word embedding could efficiently capture semantic similarities and regularities across diverse expressions. The results indicate that the proposed framework can accurately recover missing label information for medical images by automatically extracting information from electronic medical records. Hence, it could serve as a powerful tool for exploring useful training data in various medical AI applications.
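
The idea of labeling a note from word embeddings rather than exact keywords can be illustrated with a toy example. The sketch below is not the MinBERT pipeline: it assumes a handful of hand-written, hypothetical 3-dimensional word vectors in place of learned BERT embeddings, and it swaps the deep neural network classifier for a simple nearest-centroid rule with cosine similarity. All names (`TOY_EMBEDDINGS`, `embed_note`, `fit_centroids`, `recover_label`) and the example notes are invented for illustration.

```python
import math

# Hypothetical word vectors, hand-written for illustration only.
# A real system would learn embeddings from a large corpus of notes.
TOY_EMBEDDINGS = {
    "ct": (0.9, 0.1, 0.0),
    "computed": (0.85, 0.15, 0.0),
    "tomography": (0.8, 0.2, 0.0),
    "mri": (0.1, 0.9, 0.0),
    "magnetic": (0.15, 0.85, 0.0),
    "resonance": (0.2, 0.8, 0.0),
    "chest": (0.3, 0.3, 0.4),
    "scan": (0.4, 0.4, 0.2),
}


def embed_note(note):
    """Average the vectors of known words; return None if no word is known."""
    vectors = [TOY_EMBEDDINGS[w] for w in note.lower().split() if w in TOY_EMBEDDINGS]
    if not vectors:
        return None
    dim = len(vectors[0])
    return tuple(sum(v[i] for v in vectors) / len(vectors) for i in range(dim))


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def fit_centroids(labeled_notes):
    """Compute one mean note vector per label from (note, label) pairs."""
    sums, counts = {}, {}
    for note, label in labeled_notes:
        vec = embed_note(note)
        if vec is None:
            continue
        acc = sums.setdefault(label, [0.0] * len(vec))
        for i, x in enumerate(vec):
            acc[i] += x
        counts[label] = counts.get(label, 0) + 1
    return {label: tuple(x / counts[label] for x in acc) for label, acc in sums.items()}


def recover_label(note, centroids):
    """Return the label whose centroid is most similar to the note vector."""
    vec = embed_note(note)
    if vec is None:
        return None
    return max(centroids, key=lambda label: cosine(vec, centroids[label]))


# Hypothetical labeled notes; each uses only the abbreviation.
TRAIN = [("chest ct scan", "CT"), ("mri scan", "MRI")]
CENTROIDS = fit_centroids(TRAIN)

# Neither query shares its modality keyword with the training notes.
print(recover_label("computed tomography chest", CENTROIDS))
print(recover_label("magnetic resonance", CENTROIDS))
```

In this toy setup a keyword rule that looks for "ct" or "mri" would miss both queries, whereas the averaged vectors still land near the right centroid because the hypothetical embeddings place "tomography" close to "ct" and "resonance" close to "mri".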

## Keywords

MinBERT, convolutional neural network, electronic medical record, medical image labeling, word embedding
Received: 05 January 2022 Revised: 29 April 2022 Accepted: 18 May 2022 Published: 06 January 2023 Issue date: August 2023
