Volume 29, Issue 1

A Hybrid Air Quality Prediction Model Based on Empirical Mode Decomposition

Yuxuan Cao1, Difei Zhang2, Shaoqi Ding1, Weiyi Zhong1, Chao Yan1,3
1 School of Computer Science, Qufu Normal University, Rizhao 276826, China
2 School of Mathematical Sciences, Qufu Normal University, Qufu 273165, China
3 College of Economic and Management, Shandong University of Science and Technology, Qingdao 250307, China

Abstract

Air pollution is a severe environmental problem in urban areas. Accurate air quality prediction can help governments and individuals make sound decisions to cope with potential air pollution. As a classic time series forecasting model, the AutoRegressive Integrated Moving Average (ARIMA) has been widely adopted in air quality prediction. However, because of the volatility of air quality and the lack of additional context information, i.e., the spatial relationships among monitoring stations, traditional ARIMA models suffer from unstable prediction performance. Although some deep networks can achieve higher accuracy, they require large amounts of training data, heavy computation, and considerable time. In this paper, we propose a hybrid model to simultaneously predict seven air pollution indicators from multiple monitoring stations. The proposed model consists of three components: (1) an extended ARIMA to predict matrix series of multiple air quality indicators from several adjacent monitoring stations; (2) Empirical Mode Decomposition (EMD) to decompose the air quality time series data into multiple smooth sub-series; and (3) truncated Singular Value Decomposition (SVD) to compress and denoise the expanded matrix. Experimental results on a public dataset show that the proposed model outperforms state-of-the-art air quality forecasting models in both accuracy and time cost.
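
To illustrate the third component named in the abstract, the following is a minimal NumPy sketch of truncated SVD used as a low-rank compression and denoising step. It is not the authors' implementation: the function name `truncated_svd_denoise`, the 5 x 7 matrix shape (stations by indicators), the retained rank of 2, and the synthetic data are all assumptions chosen for illustration only.

```python
import numpy as np


def truncated_svd_denoise(matrix: np.ndarray, rank: int) -> np.ndarray:
    """Return the rank-`rank` truncated-SVD approximation of `matrix`."""
    u, s, vt = np.linalg.svd(matrix, full_matrices=False)
    k = min(rank, s.size)  # never keep more components than exist
    # Keep only the k largest singular values and their singular vectors.
    return (u[:, :k] * s[:k]) @ vt[:k, :]


# Hypothetical data: 5 stations x 7 indicators driven by 2 latent factors.
rng = np.random.default_rng(0)
station_factors = rng.normal(size=(5, 2))
indicator_factors = rng.normal(size=(2, 7))
clean = station_factors @ indicator_factors
noisy = clean + 0.05 * rng.normal(size=clean.shape)

# Discard the small singular values, which mostly carry noise here.
denoised = truncated_svd_denoise(noisy, rank=2)
singular_values = np.linalg.svd(noisy, compute_uv=False)
```

Keeping only the largest singular values stores the matrix with fewer numbers and drops the directions dominated by noise. How many components to keep is a tuning choice that this sketch fixes arbitrarily.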

Keywords: AutoRegressive Integrated Moving Average (ARIMA), Singular Value Decomposition (SVD), air quality prediction, Empirical Mode Decomposition (EMD)


Publication history

Received: 18 September 2022
Revised: 15 November 2022
Accepted: 26 November 2022
Published: 21 August 2023
Issue date: February 2024

Copyright

© The author(s) 2024.

Rights and permissions

The articles published in this open access journal are distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/).
