Open Access Issue
A Clinical Data Analysis Based Diagnostic Systems for Heart Disease Prediction Using Ensemble Method
Big Data Mining and Analytics 2023, 6 (4): 513-525
Published: 29 August 2023
Downloads: 92

The correct diagnosis of heart disease can save lives, while an incorrect diagnosis can be lethal. This study uses the UCI machine learning heart disease dataset to compare the results and analyses of various machine learning approaches, including deep learning. The research is carried out on a dataset with 13 primary features. Support vector machine and logistic regression algorithms are applied to the dataset, with logistic regression showing the highest accuracy in predicting coronary disease. The datasets are processed in Python. Many research initiatives have applied machine learning to accelerate work in the healthcare sector. We also use conventional machine learning approaches to uncover the relationships among the features available in the dataset and then exploit them to anticipate heart disease risk. Evaluation with accuracy and the confusion matrix yields favorable outcomes. The dataset contains certain unnecessary features, which are handled using isolation, logistic regression, and Support Vector Machine (SVM) classification to obtain the best results.
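As a rough illustration of the comparison described in this abstract, the sketch below trains logistic regression and an SVM on the 13-feature UCI heart disease data and reports accuracy and the confusion matrix. The file name heart.csv, the target column name, and the train/test split are assumptions for illustration, not details taken from the paper.

```python
# Minimal sketch of the SVM vs. logistic regression comparison on the
# UCI heart disease data. File name, column names, and split ratio are
# assumed; the paper's exact pipeline may differ.
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score, confusion_matrix

df = pd.read_csv("heart.csv")                 # assumed file: 13 features + "target"
X, y = df.drop(columns=["target"]), df["target"]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y)

scaler = StandardScaler().fit(X_train)        # scale features for both models
X_train, X_test = scaler.transform(X_train), scaler.transform(X_test)

for name, model in [("Logistic regression", LogisticRegression(max_iter=1000)),
                    ("SVM", SVC(kernel="rbf"))]:
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    print(name, "accuracy:", accuracy_score(y_test, pred))
    print(confusion_matrix(y_test, pred))
```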

Open Access Issue
Diagnosis and Detection of Alzheimer’s Disease Using Learning Algorithm
Big Data Mining and Analytics 2023, 6 (4): 504-512
Published: 29 August 2023
Downloads: 97

Brain disease classification is a vital issue in Computer-Aided Detection (CAD). Alzheimer’s Disease (AD) and brain tumors are among the primary causes of death. These diseases are studied with Magnetic Resonance Imaging (MRI), Positron Emission Tomography (PET), and Computed Tomography (CT) scans, which require expertise to interpret. AD is most prevalent in the elderly and can be fatal in its later stages. A preliminary result can be determined from the mini-mental state exam score, after which an MRI scan of the brain is performed. Apart from that, various classification algorithms, including machine learning and deep learning, are useful for diagnosing from MRI scans, but they have limitations in terms of accuracy. This paper proposes pre-processing methods that significantly improve the classification performance on these MRI images and reduce the time required to train several pre-existing learning algorithms. A dataset was obtained from the Alzheimer’s Disease Neuroimaging Initiative (ADNI) and converted from a 4D format to a 2D format. Selective clipping, grayscale image conversion, and histogram equalization techniques were used to pre-process the images. After pre-processing, we applied three learning algorithms for AD classification: random forest, XGBoost, and Convolutional Neural Networks (CNN). Results computed on the dataset show that the proposed approach outperforms existing work, with an accuracy of 97.57% and a sensitivity of 97.60%.
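A minimal sketch of the pre-processing steps named in this abstract, grayscale conversion and histogram equalization on a single 2D slice, is shown below using OpenCV. The clipping box, slice data, and intensity rescaling are assumptions for illustration; the paper's selective-clipping criterion and ADNI handling are not reproduced.

```python
# Sketch of grayscale conversion and histogram equalization on one 2D MRI
# slice. The clipping region and the random input slice are assumed; the
# paper's actual pre-processing pipeline may differ.
import cv2
import numpy as np

def preprocess_slice(slice_2d: np.ndarray,
                     clip_box=(20, 20, 200, 200)) -> np.ndarray:
    """Clip a region of interest, rescale to 8-bit grayscale, and equalize."""
    y0, x0, y1, x1 = clip_box                      # assumed "selective clipping" region
    roi = slice_2d[y0:y1, x0:x1].astype(np.float32)

    # Rescale raw intensities to 0-255 so cv2.equalizeHist can be applied.
    roi = (roi - roi.min()) / (roi.max() - roi.min() + 1e-8) * 255.0
    gray = roi.astype(np.uint8)

    return cv2.equalizeHist(gray)                  # histogram equalization

# Example with a random array standing in for one 2D MRI slice.
slice_2d = np.random.rand(256, 256)
out = preprocess_slice(slice_2d)
print(out.shape, out.dtype)
```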

Open Access Issue
Replication-Based Query Management for Resource Allocation Using Hadoop and MapReduce over Big Data
Big Data Mining and Analytics 2023, 6 (4): 465-477
Published: 29 August 2023
Downloads: 38

We live in an age where data are being generated all around us at alarming rates, creating pressure to implement data storage and recovery processes that are both affordable and straightforward. The MapReduce model is used to build parallel, distributed algorithms that process large datasets on a cluster. Building on Hadoop’s MapReduce strategy, this work proposes a new algorithm, usable in both non-commercial and commercial settings, to address the disproportionate or skewed results that can arise on a Hadoop cluster. In the experiments conducted for this work, jobs are scheduled, the data positions of matrices are matched, clustering is performed before selection, and accurate mapping and internal reliability keep related data close together to reduce running and execution times. The mapper output and its components have been implemented, the map and reduce functions have been applied, and the input key/value pairs and output key/value pairs of the execution have been defined. This paper focuses on evaluating this technique for the efficient retrieval of large volumes of data. The technique covers the capabilities needed to manage a massive database, from storage and indexing techniques to query distribution, scalability, and performance in heterogeneous environments. The results show that the proposed work reduces data processing time by 30%.
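To make the map/reduce key/value flow mentioned in this abstract concrete, the sketch below shows a generic Hadoop-Streaming-style mapper and reducer in Python. The token-counting logic is a placeholder for illustration only, not the paper's replication-based query-management algorithm.

```python
# Generic Hadoop-Streaming-style mapper/reducer illustrating input and
# output key/value pairs. The counting logic is a placeholder, not the
# paper's replication-based resource-allocation scheme.
import sys
from itertools import groupby

def mapper(lines):
    """Emit (key, 1) pairs for every token on every input line."""
    for line in lines:
        for token in line.strip().split():
            yield token, 1

def reducer(pairs):
    """Sum the values for each key; pairs must arrive sorted by key."""
    for key, group in groupby(pairs, key=lambda kv: kv[0]):
        yield key, sum(v for _, v in group)

if __name__ == "__main__":
    # Local stand-in for the shuffle phase: map, sort by key, then reduce.
    mapped = sorted(mapper(sys.stdin), key=lambda kv: kv[0])
    for key, total in reducer(mapped):
        print(f"{key}\t{total}")
```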
