Volume 4, Issue 2




Prediction of COVID-19 Confirmed, Death, and Cured Cases in India Using Random Forest Model

Vishan Kumar Gupta, Avdhesh Gupta, Dinesh Kumar, Anjali Sardana
Department of Computer Science and Engineering (CSE), Graphic Era Deemed to be University, Dehradun 248002, India
Department of CSE, IMS Engineering College, Ghaziabad 201009, India
Department of CSE, KIET Group of Institutions, Ghaziabad 201206, India

Abstract

The novel coronavirus SARS-CoV-2 causes an unusual viral pneumonia in patients. It was first identified in late December 2019 and was later declared a pandemic by the World Health Organization because of its fatal effects on public health. At present, cases of COVID-19 are increasing exponentially day by day across the world. Here, we predict COVID-19 cases, i.e., confirmed, death, and cured cases, in India only. The analysis is based on the cases reported in the different states of India in chronological order. Our dataset contains multiple classes, so we perform multi-class classification. On this dataset, we first performed data cleansing and feature selection, and then forecast all classes using random forest, linear model, support vector machine, decision tree, and neural network. The random forest model outperformed the others and is therefore used for prediction and analysis of all the results. K-fold cross-validation is performed to measure the consistency of the model.

Keywords: COVID-19, random forest, coronavirus, respiratory tract, multi-class classification
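
The K-fold cross-validation step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy data, the function names, and the majority-class stand-in model are all hypothetical, and the stand-in is used in place of a random forest only to keep the example dependency-free. The idea is to split the data into k folds, hold each fold out once, and examine the spread of the per-fold scores as a rough measure of consistency.

```python
import random
import statistics


def k_fold_indices(n_samples, k, seed=0):
    """Shuffle indices 0..n_samples-1 and split them into k near-equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Fold i takes every k-th index starting at i, so fold sizes differ by at most 1.
    return [indices[i::k] for i in range(k)]


def cross_validate(X, y, k, fit, predict, seed=0):
    """Return the accuracy obtained on each of the k held-out folds."""
    scores = []
    for held_out in k_fold_indices(len(X), k, seed):
        held = set(held_out)
        train = [i for i in range(len(X)) if i not in held]
        model = fit([X[i] for i in train], [y[i] for i in train])
        correct = sum(predict(model, X[i]) == y[i] for i in held_out)
        scores.append(correct / len(held_out))
    return scores


# Stand-in model (assumption): always predicts the most frequent training label.
# A real study would plug a random forest or another classifier in here.
def fit_majority(X_train, y_train):
    return statistics.mode(y_train)


def predict_majority(model, x):
    return model


# Hypothetical toy data: one feature (a day index) and a class label per row.
X = [[day] for day in range(20)]
y = ["low"] * 14 + ["high"] * 6

scores = cross_validate(X, y, k=5, fit=fit_majority, predict=predict_majority)
mean_score = statistics.mean(scores)
spread = statistics.pstdev(scores)  # a small spread suggests consistent folds
print(scores, mean_score, spread)
```

Passing `fit` and `predict` as arguments keeps the fold logic separate from the model, so any of the classifiers compared in the abstract could be evaluated on identical folds.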


Publication history

Received: 17 June 2020
Revised: 10 August 2020
Accepted: 21 August 2020
Published: 01 February 2021
Issue date: June 2021

Copyright

© The author(s) 2021

Acknowledgements

We are very thankful to the Indian Ministry of Health and Family Welfare (MoHFW) for making the data available to the general public. We thank covid19india.org for providing state-level details to the general public. We are also thankful to Kaggle and the Worldometer website, which provide extensive date-wise data for performing data analytics.

Rights and permissions

The articles published in this open access journal are distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/).
