Volume 28, Issue 1

KTI-RNN: Recognition of Heart Failure from Clinical Notes

Dengao Li¹, Huiting Ma¹, Wenjing Li², Baofeng Zhao³, Jumin Zhao⁴, Yi Liu⁴, Jian Fu¹
¹ College of Data Science, Taiyuan University of Technology, Jinzhong 030600, China
² Department of Statistics and Applied Probability, University of California, Santa Barbara, CA 93106, USA
³ College of Mining Engineering, Taiyuan University of Technology, Taiyuan 030024, China
⁴ College of Information and Computer, Taiyuan University of Technology, Jinzhong 030600, China

Abstract

Although deep learning methods have recently attracted considerable attention in the medical field, analyzing large-scale electronic health record data remains difficult. In particular, accurate recognition of heart failure is key to helping doctors make sound treatment decisions. This study uses data from the Medical Information Mart for Intensive Care (MIMIC) database. Compared with structured data, unstructured data contain abundant patient information, but they are harder to process because they contain much colloquial vocabulary and their content is sparse. To address these problems, we propose the KTI-RNN model for recognizing heart failure from unstructured data. The term frequency-inverse word frequency (TF-IWF) model is used to extract a keyword set, and the latent Dirichlet allocation (LDA) model is used to extract a topic word set. Together, these two sets expand the content of the medical record text and thereby mitigate its sparsity. Finally, we embed a global attention mechanism and a gating mechanism between the bidirectional recurrent neural network (BiRNN) and the output layer. We call the resulting network gated-attention-BiRNN (GA-BiRNN) and use it to identify heart failure from extensive medical texts. Results show that the proposed KTI-RNN model achieves an F1 score of 85.57% and an accuracy of 85.59%.
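
The keyword-extraction step can be illustrated with a minimal sketch of TF-IWF scoring. The abstract does not give the exact formulation used in KTI-RNN, so the code below assumes one common definition: the term frequency of a word within a note, multiplied by the logarithm of the total number of word occurrences in the corpus divided by the number of occurrences of that word. The function names `tf_iwf` and `top_keywords` and the toy notes are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def tf_iwf(docs):
    """Score every word of every tokenized document with TF-IWF.

    Assumed formulation (not confirmed by the abstract):
      TF(w, d) = count of w in d / number of tokens in d
      IWF(w)   = log(total tokens in corpus / count of w in corpus)
    """
    corpus_counts = Counter(word for doc in docs for word in doc)
    total_tokens = sum(corpus_counts.values())
    scores = []
    for doc in docs:
        doc_counts = Counter(doc)
        doc_len = len(doc)
        scores.append({
            word: (count / doc_len) * math.log(total_tokens / corpus_counts[word])
            for word, count in doc_counts.items()
        })
    return scores


def top_keywords(docs, k=3):
    """Return the k highest-scoring words per document (ties broken alphabetically)."""
    return [
        sorted(doc_scores, key=lambda w: (-doc_scores[w], w))[:k]
        for doc_scores in tf_iwf(docs)
    ]


# Toy, made-up notes used only to exercise the functions.
notes = [
    "patient reports dyspnea and edema".split(),
    "patient denies chest pain".split(),
    "patient has edema and reduced ejection fraction".split(),
]

if __name__ == "__main__":
    for keywords in top_keywords(notes, k=2):
        print(keywords)
```

Under this assumed definition, a word such as "patient" that appears in every note receives a lower score than a word that is rare across the corpus, which is the property that makes the top-scoring words usable as a keyword set for expanding sparse text.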

Keywords: deep learning, diagnosis, heart failure, text classification


Publication history

Received: 08 November 2021
Accepted: 22 November 2021
Published: 21 July 2022
Issue date: February 2023

Copyright

© The author(s) 2023.

Acknowledgements

This work was supported by the National Major Scientific Research Instrument Development Project (No. 62027819): High-Speed Real-Time Analyzer for Laser Chip’s Optical Catastrophic Damage Process; the General Object of the National Natural Science Foundation (No. 62076177): Study on the Risk Assessment Model of Heart Failure by Integrating Multi-Modal Big Data; and Shanxi Province Key Technology and Generic Technology R&D Project (No. 2020XXX007): Energy Internet Integrated Intelligent Data Management and Decision Support Platform.

Rights and permissions

The articles published in this open access journal are distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/).
