B. Cao, J. W. Wu, L. C. Cao, Y. S. Xu, and J. Fan, Long-term and multi-step ahead call traffic forecasting with temporal features mining, Mobile Netw. Appl., vol. 25, no. 2, pp. 701-712, 2020.
X. R. Shao, C. S. Kim, and P. Sontakke, Accurate deep model for electricity consumption forecasting using multi-channel and multi-scale feature fusion CNN-LSTM, Energies, vol. 13, no. 8, p. 1881, 2020.
X. W. Yi, J. B. Zhang, Z. Y. Wang, T. R. Li, and Y. Zheng, Deep distributed fusion network for air quality prediction, in Proc. 24th ACM SIGKDD Int. Conf. Knowledge Discovery & Data Mining, London, UK, 2018, pp. 965-973.
S. L. Ho and M. Xie, The use of ARIMA models for reliability forecasting and analysis, Computers & Industrial Engineering, vol. 35, nos. 1&2, pp. 213-216, 1998.
C. H. Liu, S. C. Hoi, P. L. Zhao, and J. L. Sun, Online ARIMA algorithms for time series prediction, in Proc. 30th AAAI Conf. Artificial Intelligence, Phoenix, AR, USA, 2016, pp. 1867-1873.
G. E. P. Box and D. A. Pierce, Distribution of residual autocorrelations in autoregressive-integrated moving average time series models, J. Am. Stat. Assoc., vol. 65, no. 332, pp. 1509-1526, 1970.
G. Dudek, Short-term load forecasting using random forests, in Intelligent Systems’2014, D. Filev, J. Jablkowski, J. Kacprzyk, M. Krawczak, I. Popchev, L. Rutkowski, V. Sgurev, E. Sotirova, P. Szynkarczyk, and S. Zadrozny, eds. Cham, Germany: Springer, 2015, pp. 821-828.
N. I. Sapankevych and R. Sankar, Time series prediction using support vector machines: A survey, IEEE Comput. Intell. Mag., vol. 4, no. 2, pp. 24-38, 2009.
B. Krawczyk, Learning from imbalanced data: Open challenges and future directions, Prog. Artif. Intell., vol. 5, no. 4, pp. 221-232, 2016.
X. Y. Liu, J. X. Wu, and Z. H. Zhou, Exploratory undersampling for class-imbalance learning, IEEE Trans. Syst., Man, Cybern., Part B (Cybern.), vol. 39, no. 2, pp. 539-550, 2009.
N. V. Chawla, K. W. Bowyer, L. O. Hall, and W. P. Kegelmeyer, SMOTE: Synthetic minority over-sampling technique, J. Artif. Intell. Res., vol. 16, pp. 321-357, 2002.
Y. M. Sun, M. S. Kamel, A. K. C. Wong, and Y. Wang, Cost-sensitive boosting for classification of imbalanced data, Pattern Recognit., vol. 40, no. 12, pp. 3358-3378, 2007.
S. H. Khan, M. Hayat, F. Sohel, and R. Togneri. Cost-sensitive learning of deep feature representations from imbalanced data, IEEE Transactions on Neural Networks and Learning Systems, vol. 29, no. 8, pp. 3573-3587, 2017.
N. Moniz, P. Branco, and L. Torgo, Resampling strategies for imbalanced time series forecasting, Int. J. Data Sci. Anal., vol. 3, no. 3, pp. 161-181, 2017.
A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, L. Kaiser, and I. Polosukhin, Attention is all you need, in Proc. 31st Int. Conf. Neural Information Processing Systems, Long Beach, CA, USA, 2017, pp. 6000-6010.
K. M. He, X. Y. Zhang, S. Q. Ren, and J. Sun, Deep residual learning for image recognition, in Proc. IEEE Conf. Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 2016, pp. 770-778.
J. L. Ba, J. R. Kiros, and G. E. Hinton, Layer normalization, arXiv preprint arXiv: 1607.06450, 2016.
V. Nair and G. E. Hinton, Rectified linear units improve restricted Boltzmann machines, in Proc. 27th Int. Conf. Machine Learning, Haifa, Israel, 2010, pp. 807-814.
S. Hochreiter and J. Schmidhuber, Long short-term memory, Neural Comput., vol. 9, no. 8, pp. 1735-1780, 1997.
I. Sutskever, O. Vinyals, and Q. V. Le, Sequence to sequence learning with neural networks, in Proc. 27th Int. Conf. Neural Information Processing Systems, Montreal, Canada, 2014, pp. 3104-3112.
G. K. Lai, W. C. Chang, Y. M. Yang, and H. X. Liu, Modeling long- and short-term temporal patterns with deep neural networks, in Proc. 41st Int. ACM SIGIR Conf. Research & Development in Information Retrieval, Ann Arbor, MI, USA, 2018, pp. 95-104.
B. N. Oreshkin, D. Carpov, N. Chapados, and Y. Bengio, N-BEATS: Neural basis expansion analysis for interpretable time series forecasting, arXiv preprint arXiv:1905.10437, 2019.
D. P. Kingma and J. L. Ba, Adam: A method for stochastic optimization, arXiv preprint arXiv: 1412.6980, 2014.
L. Bianchi, J. Jarrett, and R. C. Hanumara, Improving forecasting for telemarketing centers by ARIMA modeling with intervention, Int. J. Forecasting, vol. 14, no. 4, pp. 497-504, 1998.
J. Contreras, R. Espinola, F. J. Nogales, and A. J. Conejo, ARIMA models to predict next-day electricity prices, IEEE Trans. Power Syst., vol. 18, no. 3, pp. 1014-1020, 2003.
E. D. Feigelson, G. J. Babu, and G. A. Caceres, Autoregressive times series methods for time domain astronomy, Front. Phys., vol. 6, p. 80, 2018.
C. J. Lu, T. S. Lee, and C. C. Chiu, Financial time series forecasting using independent component analysis and support vector regression, Decis. Support Syst., vol. 47, no. 2, pp. 115-125, 2009.
W. C. Kong, Z. Y. Dong, Y. W. Jia, D. J. Hill, Y. Xu, and Y. Zhang, Short-term residential load forecasting based on LSTM recurrent neural network, IEEE Trans. Smart Grid, vol. 10, no. 1, pp. 841-851, 2019.
S. S. Rangapuram, M. Seeger, J. Gasthaus, L. Stella, Y. Y. Wang, and T. Januschowski, Deep state space models for time series forecasting, in Proc. 32nd Int. Conf. Neural Information Processing Systems, Montréal, Canada, 2018, pp. 7796-7805.
G. E. A. P. A. Batista, R. C. Prati, and M. C. Monard, A study of the behavior of several methods for balancing machine learning training data, ACM SIGKDD Explorat. Newsl., vol. 6, no. 1, pp. 20-29, 2004.
C. Elkan, The foundations of cost-sensitive learning, in Proc. 17th Int. Joint Conf. Artificial Intelligence, Seattle, WA, USA, 2001, pp. 973-978.
S. J. Wang, W. Liu, J. Wu, L. B. Cao, Q. X. Meng, and P. J. Kennedy, Training deep neural networks on imbalanced data sets, in Proc. Int. Joint Conf. Neural Networks, Vancouver, Canada, 2016, pp. 4368-4374.
T. Y. Lin, P. Goyal, R. Girshick, K. M. He, and P. Dollár, Focal loss for dense object detection, in Proc. IEEE Int. Conf. Computer Vision, Venice, Italy, 2017, pp. 2999-3007.
H. S. Wang, Z. C. Cui, Y. X. Chen, M. Avidan, A. B. Abdallah, and A. Kronzer, Predicting hospital readmission via cost-sensitive deep learning, IEEE/ACM Trans. Comput. Biol. Bioinform., vol. 15, no. 6, pp. 1968-1978, 2018.