Journal Home > Volume 1 , Issue 4

The Dynamic Time Warping (DTW) algorithm is widely used in finding the global alignment of time series. Many time series data mining and analytical problems can be solved by the DTW algorithm. However, using the DTW algorithm to find similar subsequences is computationally expensive or unable to perform accurate analysis. Hence, in the literature, the parallelisation technique is used to speed up the DTW algorithm. However, due to the nature of DTW algorithm, parallelizing this algorithm remains an open challenge. In this paper, we first propose a novel method that finds the similar local subsequence. Our algorithm first searches for the possible start positions of subsequence, and then finds the best-matching alignment from these positions. Moreover, we parallelize the proposed algorithm on GPUs using CUDA and further propose an optimization technique to improve the performance of our parallelization implementation on GPU. We conducted the extensive experiments to evaluate the proposed method. Experimental results demonstrate that the proposed algorithm is able to discover time series subsequences efficiently and that the proposed GPU-based parallelization technique can further speedup the processing.


menu
Abstract
Full text
Outline
About this article

Developing a Pattern Discovery Method in Time Series Data and Its GPU Acceleration

Show Author's information Huanzhou ZhuZhuoer GuHaiming ZhaoKeyang ChenChang-Tsun LiLigang He( )
Department of Computer Science, University of Warwick, Coventry, UK.
School of Computer Science and Telecommunications Engineering, Jiangsu University, Zhengjiang 212013, China.
School of Computing and Mathematics, Charles Sturt University, Wagga Wagga, Australia.

Abstract

The Dynamic Time Warping (DTW) algorithm is widely used in finding the global alignment of time series. Many time series data mining and analytical problems can be solved by the DTW algorithm. However, using the DTW algorithm to find similar subsequences is computationally expensive or unable to perform accurate analysis. Hence, in the literature, the parallelisation technique is used to speed up the DTW algorithm. However, due to the nature of DTW algorithm, parallelizing this algorithm remains an open challenge. In this paper, we first propose a novel method that finds the similar local subsequence. Our algorithm first searches for the possible start positions of subsequence, and then finds the best-matching alignment from these positions. Moreover, we parallelize the proposed algorithm on GPUs using CUDA and further propose an optimization technique to improve the performance of our parallelization implementation on GPU. We conducted the extensive experiments to evaluate the proposed method. Experimental results demonstrate that the proposed algorithm is able to discover time series subsequences efficiently and that the proposed GPU-based parallelization technique can further speedup the processing.

Keywords: GPGPU, data mining, dynamic time warping, time series data, pattern discovery, parallel processing

References(33)

[1]
P. Esling and C. Agon, Time-series data mining, ACM Comput. Surv., vol. 45, no. 1, p. 12, 2012.
[2]
C. T. Li, Y. Y. Yuan, and R. Wilson, An unsupervised conditional random fields approach for clustering gene expression time series, Bioinformatics, vol. 24, no. 21, pp. 2467-2473, 2008.
[3]
K. P. Chan and A. W. C. Fu, Efficient time series matching by wavelets, in Proc. 15th Int. Conf. Data Engineering, Sydney, Australia, 1999, pp. 126-133.
[4]
X. P. Ge and P. Smyth, Deformable Markov model templates for time-series pattern matching, in Proc. 6th ACM SIGKDD Int. Conf. Knowledge Discovery and Data Mining, Boston, MA, USA, 2000, pp. 81-90.
DOI
[5]
K. Kalpakis, D. Gada, and V. Puttagunta, Distance measures for effective clustering of ARIMA time-series, in Proc. 2001 IEEE Int. Conf. Data Mining, San Jose, CA, USA, 2001, pp. 273-280.
[6]
J. Lin, E. Keogh, S. Lonardi, and P. Patel, Finding motifs in time series, in Proc. 2nd Workshop on Temporal Data Mining, Edmonton, Canada, 2002, pp. 53-68.
[7]
B. Chiu, E. Keogh, and S. Lonardi, Probabilistic discovery of time series motifs, in Proceedings of the 9th ACM SIGKDD Int. Conf. Knowledge Discovery and Data Mining, Washington, DC, USA, 2003, pp. 493-498.
DOI
[8]
G. Das, K. I. Lin, H. Mannila, G. Renganathan, and P. Smyth, Rule discovery from time series, in Proc. 4th Int. Conf. Knowledge Discovery and Data Mining, Menlo Park, CA, USA, 1998, pp. 16-22.
[9]
F. Höppner, Discovery of temporal patterns, in Principles of Data Mining and Knowledge Discovery, L. De Raedt and A. Siebes, eds. Springer, 2001, pp. 192-203.
DOI
[10]
E. J. Keogh and M. J. Pazzani, An enhanced representation of time series which allows fast and accurate classification, clustering and relevance feedback, in Proc. 4th Int. Conf. Knowledge Discovery and Data Mining, New York, NY, USA, 1998, pp. 239-243.
[11]
M. Hegland, W. Clarke, and M. Kahn, Mining the MACHO dataset, Comput. Phys. Commun., vol. 142, nos. 1-3, pp. 22-28, 2001.
[12]
D. Dasgupta and S. Forrest, Novelty detection in time series data using ideas from immunology, in Proc. 8th Int. Conf. Intelligent Systems, Denver, CO, USA, 1999, pp. 82-87.
[13]
J. W. Han, G. Z. Dong, and Y. W. Yin, Efficient mining of partial periodic patterns in time series database, in Proc. 15th Int. Conf. Data Engineering, Sydney, Australia, 1999, pp. 106-115.
[14]
M. Müller, Information Retrieval for Music and Motion. Springer, 2007.
DOI
[15]
C. Faloutsos, M. Ranganathan, and Y. Manolopoulos, Fast subsequence matching in time-series databases, ACM SIGMOD Rec., vol. 23, no. 2, pp. 419-429, 1994.
[16]
Y. Sakurai, C. Faloutsos, and M. Yamamuro, Stream monitoring under the time warping distance, in Proc. 23rd Int. Conf. Data Engineering, Istanbul, Turkey, 2007, pp. 1046-1055.
DOI
[17]
T. Rakthanmanon, B. Campana, A. Mueen, G. Batista, B. Westover, Q. Zhu, J. Zakaria, and E. Keogh, Searching and mining trillions of time series subsequences under dynamic time warping, in Proc. 18th ACM SIGKDD Int. Conf. Knowledge Discovery and Data Mining, Beijing, China, 2012, pp. 262-270.
DOI
[18]
X. J. Wei, C. T. Li, Z. Lei, D. Yi, and S. Z. Li, Dynamic image-to-class warping for occluded face recognition, IEEE Transactions on Information Forensics and Security, vol. 9, no. 12, pp. 2035-2050, 2014.
[19]
M. Toyoda and Y. Sakurai, Discovery of cross-similarity in data streams, in Proc. 26th Int. Conf. Data Engineering, Long Beach, CA, USA, 2010, pp. 101-104.
DOI
[20]
T. Nishii, T. Hiroyasu, M. Yoshimi, M. Miki, and H. Yokouchi, Similar subsequence retrieval from two time series data using homology search, in Proc. 2010 IEEE Int. Conf. Systems Man and Cybernetics, Istanbul, Turkey, 2010, pp. 1062-1067.
DOI
[21]
L. M. Xiao, Y. Zheng, W. Q. Tang, G. C. Yao, and L. Ruan, Parallelizing dynamic time warping algorithm using prefix computations on GPU, in Proc. 10th Int. Conf. High Performance Computing and Communications & 2013 IEEE Int. Conf. Embedded and Ubiquitous Computing, Zhangjiajie, China, 2013, pp. 294-299.
DOI
[22]
D. Sart, A. Mueen, W. Najjar, E. Keogh, and V. Niennattrakul, Accelerating dynamic time warping subsequence search with GPUs and FPGAs, in Proc. 10th Int. Conf. Data Mining, Sydney, Australia, 2010, pp. 1001-1006.
DOI
[23]
Y. D. Zhang, K. Adl, and J. Glass, Fast spoken query detection using lower-bound dynamic time warping on graphical processing units, in Proc. 2012 IEEE Int. Conf. Acoustics, Speech and Signal Processing, Kyoto, Japan, 2012, pp. 5173-5176.
DOI
[24]
A. Vahdatpour, N. Amini, and M. Sarrafzadeh, Toward unsupervised activity discovery using multi-dimensional motif detection in time series, in Proc. 21st Int. Joint Conf. Artificial Intelligence, Pasadena, CA, USA, 2009, pp. 1261-1266.
[25]
A. Mueen, E. J. Keogh, Q. Zhu, S. Cash, and M. B. Westover, Exact discovery of time series motifs, in 2009 SIAM Int. Conf. Data Mining, Sparks, NV, USA, 2009, pp. 473-484.
DOI
[26]
M. Toyoda, Y. Sakurai, and Y. Ishikawa, Pattern discovery in data streams under the time warping distance, VLDB J., vol. 22, no. 3, pp. 295-318, 2013.
[27]
S. Srikanthan, A. Kumar, and R. Gupta, Implementing the dynamic time warping algorithm in multithreaded environments for real time and unsupervised pattern discovery, in Proc. 2nd Int. Conf. Computer and Communication Technology, Allahabad, India, 2011, pp. 394-398.
DOI
[28]
N. Takahashi, T. Yoshihisa, Y. Sakurai, and M. Kanazawa, A parallelized data stream processing system using dynamic time warping distance, in Proc. 2009 Int. Conf. Complex, Intelligent and Software Intensive Systems, Fukuoka, Japan, 2009, pp. 1100-1105.
DOI
[29]
NVIDIA Corporation, NVIDIA CUDA C Programming Guide, 2011.
[30]
R. J. Hyndman, Time series data library, https://robjhyndman.com/TSDL/, 2016.
[31]
Y. P. Chen, E. Keogh, B. Hu, N. Begum, A. Bagnall, A. Mueen, and G. Batista, The UCR time series classification archive, www.cs.ucr.edu/~eamonn/time_series_data/, 2015.
[32]
C. Reiss, J. Wilkes, and J. L. Hellerstein, Google cluster-usage traces: Format + Schema, Technical report, Google Inc., Mountain View, CA, USA, November 2011, revised 2014-11-17 for version 2.1, https://github.com/google/cluster-data.
[33]
R. S. Stankovic and B. J. Falkowski, The Haar wavelet transform: Its status and achievements, Comput. Electr. Eng., vol. 29, no. 1, pp. 25-44, 2003.
Publication history
Copyright
Acknowledgements
Rights and permissions

Publication history

Received: 03 January 2018
Accepted: 02 March 2018
Published: 02 July 2018
Issue date: December 2018

Copyright

© The author(s) 2018

Acknowledgements

This research was supported by the National Natural Science Foundation of China (No. 61602215), the Science Foundation of Jiangsu Province (No. BK20150527), the EU Horizon 2020 — Marie Sklodowska-Curie Actions through the project entitled Computer Vision Enabled Multimedia Forensics and People Identification (Project No. 690907, Acronym: IDENTITY).

Rights and permissions

Return