Volume 29, Issue 1

Lesson Learned from COVID-19 Retrospective Study: An Entropy-Based Clinical-Interpretable Scorecard for Mortality Risk Control at ICU Admission

Chong Yao1, Chonghui Huangqi2, Anpeng Huang3
1 Laboratory of Network Information Security, Beihang University, Beijing 100191, China
2 Andrew and Erna Viterbi School of Engineering, University of Southern California, Los Angeles, CA 90089, USA
3 Beijing Goodwill Information and Technology Co., Ltd., and Mobile Health Laboratory, Peking University, Beijing 100871, China

Abstract

As severe acute respiratory syndrome coronavirus 2 spread globally and caused coronavirus disease 2019 (COVID-19), one challenge that we were unprepared for was how to optimally plan and distribute limited top-tier medical resources among patients in need of urgent care. To address this challenge, physicians urgently needed a scientific tool to methodically differentiate between cases of varying severity. In this study, the unique data of COVID-19 intensive care unit (ICU) patients provided by the national medical team in Wuhan were classified into discrete and continuous variable types. All continuous data were discretized using an entropy-based method and transformed into serial information margins, in which each information margin is related to a specific symptom or clinical meaning. Finally, all these native and processed discrete data were used to configure a readable scorecard through logistic regression, which serves as the scientific tool described above. A total of 322 ICU patients (age: median 64, interquartile range 54–75; males: 178 [55.28%]; deaths: 72 [22.36%]) were included in the study. The probability of mortality in COVID-19 patients can be evaluated using the scorecard model (calibration slope = 1.343, Brier score = 0.048, Dxy = 0.972, and population stability index = 0.071), which achieved the desired performance (accuracy = 0.948, area under the curve = 0.99, sensitivity = 1, and specificity = 0.939). Unlike existing machine learning methods, which work through a black-box mechanism, this new model can interpret clinical meanings from complex data. This data-information model addresses the critical question of how a computing algorithm can produce clinically meaningful results that help physicians logically allocate medical resources for COVID-19 patients. Notably, this tool has limitations, given that this research is a retrospective study. We hope that this tool will be tested further and optimized for adaptation to similar clinical cases in the future.
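
The entropy-based discretization mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it finds only a single binary cut point by maximizing information gain, whereas the method in the paper may split recursively or use a different stopping rule. The function names and the toy lab values below are hypothetical and are not taken from the study data.

```python
import math
from typing import List, Optional, Tuple


def entropy(labels: List[int]) -> float:
    """Shannon entropy (in bits) of a list of 0/1 outcome labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n
    if p in (0.0, 1.0):
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))


def best_entropy_split(
    values: List[float], labels: List[int]
) -> Optional[Tuple[float, float]]:
    """Return (cut_point, information_gain) for the best single binary split.

    Candidate cut points are midpoints between consecutive distinct values.
    Returns None when all values are identical (no split is possible).
    """
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    base = entropy([y for _, y in pairs])
    best: Optional[Tuple[float, float]] = None
    for i in range(1, n):
        lo, hi = pairs[i - 1][0], pairs[i][0]
        if lo == hi:
            continue  # no boundary between equal values
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        weighted = (len(left) * entropy(left) + len(right) * entropy(right)) / n
        gain = base - weighted
        if best is None or gain > best[1]:
            best = ((lo + hi) / 2, gain)
    return best


# Invented toy data: a continuous measurement and a 0/1 outcome (1 = death).
toy_values = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
toy_labels = [0, 0, 0, 1, 1, 1]
print(best_entropy_split(toy_values, toy_labels))  # (6.5, 1.0)
```

Each resulting interval would then act as one discrete level of the variable, which is what allows a scorecard to attach a readable score to a clinically meaningful range.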

Keywords: COVID-19, machine learning, scorecard, clinical-interpretable, ICU admission control
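
For readers unfamiliar with the evaluation metrics quoted in the abstract, the following sketch shows the standard textbook definitions of the Brier score, Somers' Dxy (computed from the area under the curve for a binary outcome), and the population stability index. It assumes these common definitions; the paper may compute them with different binning or software. All function names and toy numbers are hypothetical.

```python
import math
from typing import List


def brier_score(y_true: List[int], y_prob: List[float]) -> float:
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def auc(y_true: List[int], y_prob: List[float]) -> float:
    """Area under the ROC curve by pairwise comparison (ties count half)."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum(
        1.0 if a > b else 0.5 if a == b else 0.0 for a in pos for b in neg
    )
    return wins / (len(pos) * len(neg))


def somers_dxy(y_true: List[int], y_prob: List[float]) -> float:
    """Somers' Dxy for a binary outcome: Dxy = 2 * AUC - 1."""
    return 2 * auc(y_true, y_prob) - 1


def psi(expected_frac: List[float], actual_frac: List[float]) -> float:
    """Population stability index over matching bins.

    Both inputs are per-bin fractions that sum to 1; every bin must be
    non-empty in both populations, otherwise the logarithm is undefined.
    """
    return sum(
        (a - e) * math.log(a / e) for e, a in zip(expected_frac, actual_frac)
    )


# Invented toy predictions, not study data.
y = [0, 0, 1, 1]
p = [0.1, 0.4, 0.35, 0.8]
print(brier_score(y, p), auc(y, p), somers_dxy(y, p))
print(psi([0.5, 0.5], [0.6, 0.4]))
```

Under the identity Dxy = 2 × AUC − 1, the reported Dxy of 0.972 corresponds to an AUC of 0.986, which agrees with the reported 0.99 after rounding, assuming both were computed on the same data.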


Publication history

Received: 27 November 2022
Revised: 27 March 2023
Accepted: 09 May 2023
Published: 21 August 2023
Issue date: February 2024

Copyright

© The author(s) 2024.

Acknowledgements

This work was supported in part by the Scientific and Technological Innovation 2030 "New Generation Artificial Intelligence" Major Project (No. 2021ZD0140406) and the National Natural Science Foundation of China (No. 62041201). Qiang Ji assisted with figure drawing and data analysis for the article.

Rights and permissions

The articles published in this open access journal are distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/).
