Open Access

Similarity Search Algorithm over Data Supply Chain Based on Key Points

Peng Li, Hong Luo, Yan Sun
School of Computer Science, Beijing University of Posts and Telecommunications, Beijing 100876, China.
Beijing Key Lab of Intelligent Telecommunication Software and Multimedia, Beijing 100876, China.

Abstract

In this paper, we address similarity search among data supply chains, which plays an essential role in optimizing a supply chain and extending its value. The problem is challenging for application-oriented data supply chains because their high complexity makes direct similarity computation costly and inefficient. We propose a feature space representation model based on key points, which extracts the key features from the subsequences of the original data supply chain and simplifies each subsequence into a feature vector. We then formulate the similarity computation between subsequences based on multiscale features. Further, we propose an improved hierarchical clustering algorithm for similarity search over data supply chains. The main idea is to separate the subsequences into disjoint groups such that each group meets one specific clustering criterion; the cluster containing the query object is then the similarity search result. The experimental results show that the proposed approach is both effective and efficient for data supply chain retrieval.
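The pipeline described in the abstract (key-point features, vector similarity, clustering, then returning the query's cluster) can be illustrated with a minimal sketch. The abstract does not specify the key-point definition, the multiscale features, or the improved clustering criterion, so everything below is an assumption made for illustration: key points are taken to be endpoints and local extrema, the feature vector uses three hypothetical single-scale features, similarity is plain Euclidean distance, and clustering is ordinary single-linkage agglomeration with a distance threshold. It is not the authors' algorithm.

```python
from math import dist


def key_points(seq):
    """Return indices of the endpoints and local extrema of seq.

    Assumption: 'key point' means a point where the slope changes sign.
    Requires len(seq) >= 2.
    """
    idx = [0]
    for i in range(1, len(seq) - 1):
        left = seq[i] - seq[i - 1]
        right = seq[i + 1] - seq[i]
        if left * right < 0:  # slope changes sign: local peak or valley
            idx.append(i)
    idx.append(len(seq) - 1)
    return idx


def feature_vector(seq):
    """Hypothetical features: key-point count, value range, mean absolute slope."""
    kp = key_points(seq)
    values = [seq[i] for i in kp]
    slopes = [abs(seq[b] - seq[a]) / (b - a) for a, b in zip(kp, kp[1:])]
    return (float(len(kp)), max(values) - min(values), sum(slopes) / len(slopes))


def cluster(vectors, threshold):
    """Single-linkage agglomerative clustering, stopped at a distance threshold."""
    clusters = [[i] for i in range(len(vectors))]
    while len(clusters) > 1:
        best = None  # (distance, index of cluster a, index of cluster b)
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = min(dist(vectors[i], vectors[j])
                        for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        if best[0] > threshold:  # closest pair is too far apart: stop merging
            break
        _, a, b = best
        clusters[a].extend(clusters[b])
        del clusters[b]
    return clusters


def similarity_search(subsequences, query_index, threshold):
    """Return indices of subsequences that share a cluster with the query."""
    vectors = [feature_vector(s) for s in subsequences]
    for group in cluster(vectors, threshold):
        if query_index in group:
            return sorted(i for i in group if i != query_index)
    return []


subsequences = [
    [0, 2, 1, 3, 0],      # zigzag
    [0, 2.1, 1, 3.1, 0],  # nearly the same zigzag
    [0, 1, 2, 3, 4],      # monotone ramp
    [1, 2, 3, 4, 5],      # shifted monotone ramp
]
print(similarity_search(subsequences, 0, threshold=1.0))  # [1]
```

Because the groups are disjoint, a query is answered by one cluster lookup rather than by comparing the query against every subsequence, which is the efficiency argument the abstract makes. The threshold value here is arbitrary and would need tuning on real data.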

Tsinghua Science and Technology
Pages 174-184
Cite this article:
Li P, Luo H, Sun Y. Similarity Search Algorithm over Data Supply Chain Based on Key Points. Tsinghua Science and Technology, 2017, 22(2): 174-184. https://doi.org/10.23919/TST.2017.7889639

Received: 25 November 2016
Revised: 21 December 2016
Accepted: 03 January 2017
Published: 06 April 2017
© The author(s) 2017