Volume 5, Issue 2

A Mini-Review of Machine Learning in Big Data Analytics: Applications, Challenges, and Prospects

Isaac Kofi Nti, Juanita Ahia Quarcoo, Justice Aning, Godfred Kusi Fosu
Department of Computer Science and Informatics, University of Energy and Natural Resources, Sunyani BS2103, Ghana
Department of Electrical & Electronic Engineering, Sunyani Technical University, Sunyani BS2103, Ghana
Department of Computer Science, Sunyani Technical University, Sunyani BS2103, Ghana

Abstract

The widespread availability of digital technology in the hands of citizens worldwide has made an unprecedented amount of data available. The capability to process these massive volumes of data in real time with Big Data Analytics (BDA) tools and Machine Learning (ML) algorithms offers many benefits. However, the large number of free BDA tools, platforms, and data mining tools makes it challenging to select the appropriate one for a given task. This paper presents a comprehensive mini-literature review of ML in BDA. Using a keyword search, a total of 1512 published articles were identified. These were screened down to 140 articles based on the study's proposed novel taxonomy. The study outcome shows that deep neural networks (15%), support vector machines (15%), artificial neural networks (14%), decision trees (12%), and ensemble learning techniques (11%) are widely applied in BDA. The related application fields, challenges, and, most importantly, openings for future research are detailed.

Keywords: MapReduce, Hadoop, Machine Learning (ML), Big Data Analytics (BDA), Big Data (BD)


Publication history

Received: 14 October 2021
Revised: 11 December 2021
Accepted: 13 December 2021
Published: 25 January 2022
Issue date: June 2022

Copyright

© The author(s) 2022.

Rights and permissions

The articles published in this open access journal are distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/).
