Open Access

Machine Learning for Selecting Important Clinical Markers of Imaging Subgroups of Cerebral Small Vessel Disease Based on a Common Data Model

IT Center, Beijing Tiantan Hospital, Capital Medical University, Beijing 100070, China
Department of Neurology, Beijing Tiantan Hospital, Capital Medical University, Beijing 100070, China
China National Clinical Research Center for Neurological Diseases, Beijing Tiantan Hospital, Capital Medical University, Beijing 100070, China
West China Biomedical Big Data Center, West China Hospital, Sichuan University, Chengdu 610044, China
College of Artificial Intelligence, Nanjing Agricultural University, Nanjing 210095, China

Abstract

Differences among the imaging subgroups of cerebral small vessel disease (CSVD) need to be further explored. First, we use propensity score matching to obtain balanced datasets. Random forest (RF) is then adopted to classify the subgroups and to select features, and its performance is compared with that of support vector machine (SVM) and extreme gradient boosting (XGBoost). The top 10 important features are entered into a stepwise logistic regression to obtain the odds ratio (OR) and 95% confidence interval (CI). The dataset comprises 41 290 adult inpatient records with a diagnosis of CSVD. The accuracy and area under the curve (AUC) of RF are close to 0.7, and RF performs best in classification compared with SVM and XGBoost. The OR and 95% CI of hematocrit for white matter lesions (WMLs), lacunes, microbleeds, atrophy, and enlarged perivascular space (EPVS) are 0.9875 (0.9857−0.9893), 0.9728 (0.9705−0.9752), 0.9782 (0.9740−0.9824), 1.0093 (1.0081−1.0106), and 0.9716 (0.9597−0.9832), respectively. The OR and 95% CI of red cell distribution width for WMLs, lacunes, atrophy, and EPVS are 0.9600 (0.9538−0.9662), 0.9630 (0.9559−0.9702), 1.0751 (1.0686−1.0817), and 0.9304 (0.8864−0.9755), respectively. The OR and 95% CI of platelet distribution width for WMLs, lacunes, and microbleeds are 1.1796 (1.1636−1.1958), 1.1663 (1.1476−1.1853), and 1.0416 (1.0152−1.0687), respectively. This study proposes a new analytical framework that uses machine learning on a common data model to select important clinical markers for CSVD. The framework offers low cost, fast speed, a large sample size, and continuous data sources.
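The ORs and 95% CIs reported above come from logistic regression. As a minimal sketch of how such values relate to a fitted coefficient, the snippet below assumes a standard Wald interval, OR = exp(beta) with bounds exp(beta ± 1.96·SE). The abstract does not state which interval method the authors used, and the function names `odds_ratio_ci` and `coef_from_reported` are hypothetical, introduced only for illustration.

```python
import math


def odds_ratio_ci(beta, se, z=1.96):
    """Return (OR, lower, upper) for a logistic regression coefficient.

    Assumes a Wald interval on the log-odds scale (an assumption, not
    confirmed by the abstract).
    """
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


def coef_from_reported(or_value, lower, upper, z=1.96):
    """Back out (beta, se) from a reported OR and its 95% CI.

    The standard error is recovered from the CI width on the log scale.
    """
    beta = math.log(or_value)
    se = (math.log(upper) - math.log(lower)) / (2 * z)
    return beta, se


# Hematocrit for WMLs as reported in the abstract: 0.9875 (0.9857-0.9893).
beta, se = coef_from_reported(0.9875, 0.9857, 0.9893)
or_value, lower, upper = odds_ratio_ci(beta, se)
print(f"beta={beta:.5f}, se={se:.5f}")
print(f"OR={or_value:.4f} ({lower:.4f}-{upper:.4f})")
```

Under this assumption, an OR below 1 corresponds to a negative coefficient (lower odds of the imaging subgroup per unit increase in the marker), and an OR above 1 to a positive one. Because the reported ORs are per-unit effects, values close to 1 with narrow intervals can still reflect a consistent association in a sample of this size.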

Tsinghua Science and Technology
Pages 1495-1508
Cite this article:
Lan L, Hu G, Li R, et al. Machine Learning for Selecting Important Clinical Markers of Imaging Subgroups of Cerebral Small Vessel Disease Based on a Common Data Model. Tsinghua Science and Technology, 2024, 29(5): 1495-1508. https://doi.org/10.26599/TST.2023.9010092


Received: 12 December 2022
Revised: 10 June 2023
Accepted: 05 July 2023
Published: 02 May 2024
© The Author(s) 2024.

The articles published in this open access journal are distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/).
