Volume 4, Issue 4


A Deep-Learning Prediction Model for Imbalanced Time Series Data Forecasting

Chenyu Hou, Jiawei Wu, Bin Cao, Jing Fan
College of Computer Science and Technology, Zhejiang University of Technology, Hangzhou 310023, China

Abstract

Time series forecasting has attracted wide attention in recent decades. However, some time series are imbalanced and show different patterns in special and normal periods, which degrades prediction accuracy for the special periods. In this paper, we aim to develop a unified model that alleviates the imbalance and thus improves the prediction accuracy for special periods. This task is challenging for two reasons: (1) the temporal dependency of series, and (2) the tradeoff between mining similar patterns and distinguishing the different distributions of different periods. To tackle these issues, we propose a self-attention-based time-varying prediction model with a two-stage training strategy. First, we use an encoder-decoder module with the multi-head self-attention mechanism to extract common patterns of time series. Then, we propose a time-varying optimization module to optimize the results for special periods and eliminate the imbalance. Moreover, we propose reverse distance attention in place of traditional dot attention to highlight the importance of similar historical values to the forecast results. Finally, extensive experiments show that our model outperforms the baselines in terms of mean absolute error and mean absolute percentage error.
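
The two evaluation metrics named above, mean absolute error (MAE) and mean absolute percentage error (MAPE), can be reported separately for special and normal periods to make the imbalance visible. The following is a minimal sketch under stated assumptions, not the authors' evaluation code: the function names, the boolean `is_special` mask, and the sample values are all hypothetical.

```python
from typing import Dict, Sequence, Tuple


def mae(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute error: the average of |actual - predicted|."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mape(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute percentage error, in percent.

    MAPE is undefined when an actual value is zero, so that case is rejected.
    """
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    ratios = [abs(a - p) / abs(a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(ratios) / len(ratios)


def evaluate_by_period(
    actual: Sequence[float],
    predicted: Sequence[float],
    is_special: Sequence[bool],
) -> Dict[str, Tuple[float, float]]:
    """Return (MAE, MAPE) per period type; `is_special` is a hypothetical mask."""
    results: Dict[str, Tuple[float, float]] = {}
    for name, flag in (("normal", False), ("special", True)):
        a = [x for x, s in zip(actual, is_special) if s == flag]
        p = [x for x, s in zip(predicted, is_special) if s == flag]
        if a:  # skip a period type that has no samples
            results[name] = (mae(a, p), mape(a, p))
    return results


# Hypothetical series: three normal steps followed by two special steps.
actual = [100.0, 120.0, 110.0, 300.0, 320.0]
predicted = [98.0, 123.0, 110.0, 250.0, 300.0]
is_special = [False, False, False, True, True]

scores = evaluate_by_period(actual, predicted, is_special)
for period, (err_abs, err_pct) in scores.items():
    print(f"{period}: MAE={err_abs:.2f}, MAPE={err_pct:.2f}%")
```

In this made-up example the errors on the special steps are much larger than on the normal steps, a gap that a single aggregate score over the whole series would partly hide.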

Keywords: deep learning, time series forecasting, imbalanced data, prediction model


Publication history

Received: 21 May 2021
Accepted: 10 June 2021
Published: 26 August 2021
Issue date: December 2021

Copyright

© The author(s) 2021

Acknowledgements

This research was partially sponsored by the National Key R&D Program of China (No. 2018YFB1402800) and the Fundamental Research Funds for the Provincial Universities of Zhejiang (No. RF-A2020007).

Rights and permissions

The articles published in this open access journal are distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/).
