Journal Home > Volume 21 , Issue 1

In this paper, we study the skyline group problem over a data stream. An object can dominate another object if it is not worse than the other object on all attributes and is better than the other object on at least one attribute. If an object cannot be dominated by any other object, it is a skyline object. The skyline group problem involves finding k-item groups that cannot be dominated by any other k-item group. Existing algorithms designed to find skyline groups can only process static data. However, data changes as a stream with time in many applications, and algorithms should be designed to support skyline group queries on dynamic data. In this paper, we propose new algorithms to find skyline groups over a data stream. We use data structures, namely a hash table, dominance graph, and matrix, to store dominance information and update results incrementally. We conduct experiments on synthetic datasets to evaluate the performance of the proposed algorithms. The experimental results show that our algorithms can efficiently find skyline groups over a data stream.


menu
Abstract
Full text
Outline
About this article

Efficient Processing of Skyline Group Queries over a Data Stream

Show Author's information Xi GuoHailing LiAziguli Wulamu( )Yonghong XieYajing Fu
Beijing Key Laboratory of Knowledge Engineering for Materials Science, University of Science and Technology Beijing, Beijing 100083, China.

Abstract

In this paper, we study the skyline group problem over a data stream. An object can dominate another object if it is not worse than the other object on all attributes and is better than the other object on at least one attribute. If an object cannot be dominated by any other object, it is a skyline object. The skyline group problem involves finding k-item groups that cannot be dominated by any other k-item group. Existing algorithms designed to find skyline groups can only process static data. However, data changes as a stream with time in many applications, and algorithms should be designed to support skyline group queries on dynamic data. In this paper, we propose new algorithms to find skyline groups over a data stream. We use data structures, namely a hash table, dominance graph, and matrix, to store dominance information and update results incrementally. We conduct experiments on synthetic datasets to evaluate the performance of the proposed algorithms. The experimental results show that our algorithms can efficiently find skyline groups over a data stream.

References(18)

[1]
Xu C. and Gao Y., A novel approach for selecting the top skyline under users’ references, in Information Management and Engineering (ICIME), 2010 The 2nd IEEE International Conference on, 2010, pp. 708–712.
DOI
[2]
Im H. and Park S., Group skyline computation, Information Sciences, vol. 188, pp. 151-169, 2011.
[3]
Li C., Rajasekaran S., Zhang N., Hassan N., and Das G., On skyline groups, IEEE Transactions on Knowledge and Data Engineering (TKDE), vol. 26, no. 4, pp. 942-956, 2014.
[4]
Börzsönyi S. and Stocker D. K. K., The skyline operator, in Proceedings of the International Conference on Data Engineering (ICDE), 2001, pp. 421–430.
[5]
Chomicki J., Godfrey P., Gryz J., and Liang D., Skyline with presorting, in Proceedings of the International Conference on Data Engineering (ICDE), 2003, pp. 717–719.
[6]
Chaudhuri S., Dalvi N., and Kaushik R., Robust cardinality and cost estimation for skyline operator, in Proceedings of the International Conference on Data Engineering (ICDE), 2006, p. 64.
DOI
[7]
Tao Y. and Papadias D., Maintaining sliding window skylines on data streams, IEEE Transactions on Knowledge and Data Engineering (TKDE), vol. 18, no. 3, pp. 377- 391, 2006.
[8]
Liu X., Yang D., Ye M., and Lee W., U-skyline: A new skyline query for uncertain databases, IEEE Transactions on Knowledge and Data Engineering (TKDE), vol. 25, no. 4, pp. 945-960, 2013.
[9]
Li Y., Qu W., Li Z., Xu Y., Ji C., and Wu J., Sky line query based on user preference with mapreduce, in 2014 IEEE 12th International Conference on Dependable, Autonomic and Secure Computing, 2014, pp. 153–158.
DOI
[10]
Godfrey P., Shipley R., and Gryz J., Maximal vector computation in large data sets, in Proceedings of International Conference on Very Large Data Bases (VLDB), 2005, pp. 229–240.
[11]
Lee K. C. K., Lee W., Zheng B., Li H., and Tian Y., Z-sky: An efficient skyline query processing framework based on z-order, The VLDB Journal, vol. 19, no. 3, pp. 333-362, 2010.
[12]
Kossemann D., Ramsak F., and Rost S., Shooting stars in the sky: An online algorithm for skyline queries, in Proceedings of International Conference on Very Large Data Bases (VLDB), 2002, pp. 275–286.
DOI
[13]
Papadias D., Tao Y., Fu G., and Seeger B., An optimal and progressive algorithm for skyline queries, in ACM SIGMOD 2003, 2003, pp. 467–478.
DOI
[14]
Ercha A., Huang W., Liu S., Shi L., Gong J., Chen Y., and Shen H., A regional ionospheric tec mapping technique over China and adjacent areas: Gnss data processing and dineof analysis, SCIENCE CHINA Information Sciences, vol. 58, no. 10, p. 107201, 2015.
[15]
Li X., Wang H., Ding B., and Li X., Mabpan optimal resource allocation approach in data center networks, SCIENCE CHINA Information Sciences, vol. 57, no. 16, p. 102801, 2014.
[16]
Aggarwal C. C., ed., Data Streams: Models and Algorithms. Springer, 2007.
DOI
[17]
Morse M., Patel J. M., and Grosky W. I., Efficient continuous skyline computation, in Proceedings of the International Conference on Data Engineering (ICDE), 2006, p. 108.
DOI
[18]
Lee Y., Lee K., and Kim M. H., Efficient processing of multiple continuous skyline queries over a data stream, Information Sciences, vol. 221, pp. 316-337, 2013.
Publication history
Copyright
Acknowledgements
Rights and permissions

Publication history

Received: 01 September 2015
Accepted: 13 October 2015
Published: 04 February 2016
Issue date: February 2016

Copyright

© The author(s) 2016

Acknowledgements

This work was supported by the Fundamental Research Funds for the Central Universities (Nos. FRF-TP-14-025A1 and FRF-TP-15-025A2). This work was also supported by the Key Technologies Research and Development Program of 12th Five-Year Plan of China (No. 2013BAI13B06). The authors would like to thank Jiayu Du for revising the language errors.

Rights and permissions

Return