Volume 5, Issue 4


Influencing Factors and Clustering Characteristics of COVID-19: A Global Analysis

Tianlong Zheng, Chunli Zhang, Yueting Shi, Debao Chen, Sheng Liu
School of Computer Science and Technology, Huaibei Normal University, Huaibei 235000, China
School of Economics and Management, Tiangong University, Tianjin 300000, China

Abstract

The unprecedented coronavirus disease 2019 (COVID-19) pandemic was still raging in many countries worldwide as of 2021. Various response strategies have been developed to study the characteristics and distribution of the virus across regions of the world and thereby assist in the prevention and control of the epidemic. In this study, descriptive statistics and regression analysis were applied to COVID-19 data from different countries to compare and evaluate several regression models. The extreme random forest regression (ERFR) model performed best, and in this model population density, ozone, median age, life expectancy, and the Human Development Index (HDI) were relatively influential factors in the spread and diffusion of COVID-19. In addition, the clustering characteristics of the epidemic were analyzed with the spectral clustering algorithm. Visualization of the spectral clustering results showed that the global spread of COVID-19 was highly clustered geographically, and that the clustering characteristics and the influencing factors were to some extent consistent in their distribution. This study aims to deepen the international community's understanding of the global COVID-19 pandemic, so that countries worldwide can develop measures to mitigate potential large-scale outbreaks and improve their ability to respond to such public health emergencies.

Keywords: COVID-19, data analysis, extreme random forest regression, spectral clustering, HDI
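
The two analysis steps named in the abstract can be illustrated with a minimal sketch. It assumes that ERFR corresponds to extremely randomized trees as implemented by scikit-learn's `ExtraTreesRegressor`, and it uses synthetic data rather than the study's dataset. The feature names, the synthetic target, the number of clusters, and all hyperparameters are hypothetical choices made for illustration and are not taken from the article.

```python
import numpy as np
from sklearn.cluster import SpectralClustering
from sklearn.ensemble import ExtraTreesRegressor

# Hypothetical column names echoing the factors listed in the abstract.
FEATURES = ["population_density", "ozone", "median_age", "life_expectancy", "hdi"]


def make_synthetic_countries(n_countries=120, seed=0):
    """Build a synthetic stand-in for a per-country feature table (not real data)."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n_countries, len(FEATURES)))
    # Illustrative target: driven mostly by the first and last feature plus noise.
    y = 2.0 * X[:, 0] + 1.0 * X[:, 4] + 0.1 * rng.normal(size=n_countries)
    return X, y


def rank_features(X, y, seed=0):
    """Fit extremely randomized trees and rank features by impurity-based importance."""
    model = ExtraTreesRegressor(n_estimators=200, random_state=seed)
    model.fit(X, y)
    importances = model.feature_importances_
    order = np.argsort(importances)[::-1]
    return [(FEATURES[i], float(importances[i])) for i in order]


def cluster_countries(X, n_clusters=3, seed=0):
    """Group rows with spectral clustering on a k-nearest-neighbor affinity graph."""
    model = SpectralClustering(
        n_clusters=n_clusters,
        affinity="nearest_neighbors",
        n_neighbors=10,
        random_state=seed,
    )
    return model.fit_predict(X)


X, y = make_synthetic_countries()
ranking = rank_features(X, y)
labels = cluster_countries(X)

for name, score in ranking:
    print(f"{name}: {score:.3f}")
print("cluster sizes:", np.bincount(labels))
```

On real data, the importance ranking would be read alongside a held-out error comparison against the other regression models, and the cluster labels would be joined back to country identifiers for mapping.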

Publication history

Received: 08 September 2021
Revised: 20 February 2022
Accepted: 18 April 2022
Published: 18 July 2022
Issue date: December 2022

Copyright

© The author(s) 2022.

Acknowledgements

This work was supported in part by the National University Student Innovation and Entrepreneurship in Training Program of China (No. 202110373044), the National Natural Science Foundation of China (No. 71801108), the Laboratory Opening Project Fund of Huaibei Normal University (No. 2021sykf041), and the Special Needs Project of Huaibei Normal University (No. 2021zlgc147).

Rights and permissions

The articles published in this open access journal are distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/).
