Journal Home > Volume 24 , Issue 2

Skyline group, also named as combinational skyline or group-based skyline, has attracted more attention recently. The concept of skyline groups is proposed to address the problem in the inadequacy of the traditional skyline to answer queries that need to analyze not only individual points but also groups of points. Skyline group algorithms aim at finding groups of points that are not dominated by any other same-size groups. Although two types of dominance relationship exist between the groups defined in existing works, they have not been compared systematically under the same experimental framework. Thus, practitioners face difficulty in selecting an appropriate definition. Furthermore, the experimental evaluation in most existing works features a weakness, that is, studies only experimented on small data sets or large data sets with small dimensions. For comprehensive comparisons of the two types of definition and existing algorithms, we evaluate each algorithm in terms of time and space on various synthetic and real data sets. We reveal the characteristics of existing algorithms and provide guidelines on selecting algorithms for different situations.


menu
Abstract
Full text
Outline
About this article

Computing Skyline Groups: An Experimental Evaluation

Show Author's information Haoyang ZhuXiaoyong Li( )Qiang LiuHao Zhu
Academy of Military Science of the People’s Liberation Army, Beijing 100091, China.
Academy of Ocean Science and Engineering, National University of Defense Technology, Changsha 410073, China.
College of Computer, National University of Defense Technology, Changsha 410073, China.

Abstract

Skyline group, also named as combinational skyline or group-based skyline, has attracted more attention recently. The concept of skyline groups is proposed to address the problem in the inadequacy of the traditional skyline to answer queries that need to analyze not only individual points but also groups of points. Skyline group algorithms aim at finding groups of points that are not dominated by any other same-size groups. Although two types of dominance relationship exist between the groups defined in existing works, they have not been compared systematically under the same experimental framework. Thus, practitioners face difficulty in selecting an appropriate definition. Furthermore, the experimental evaluation in most existing works features a weakness, that is, studies only experimented on small data sets or large data sets with small dimensions. For comprehensive comparisons of the two types of definition and existing algorithms, we evaluate each algorithm in terms of time and space on various synthetic and real data sets. We reveal the characteristics of existing algorithms and provide guidelines on selecting algorithms for different situations.

Keywords: performance evaluation, skyline queries, skyline groups

References(25)

[1]
Im H. and Park S., Group skyline computation, Inf. Sci., vol. 188, no. 1, pp. 151-169, 2012.
[2]
Li C., Zhang N., Hassan N., Rajasekaran S., and Das G., On skyline groups, in 21st International Conference on Information and Knowledge Management, 2012, pp. 2119- 2123.
DOI
[3]
Chung Y., Su I., and Lee C., Efficient computation of combinatorial skyline queries, Inf. Syst., vol. 38, no. 3, pp. 369-387, 2013.
[4]
Liu J., Xiong L., Pei J., Luo J., and Zhang H., Finding pareto optimal groups: Group-based skyline, PVLDB, vol. 8, no. 13, pp. 2086-2097, 2015.
[5]
Zhang N., Li C., Hassan N., Rajasekaran S., and Das G., On skyline groups, IEEE Trans. Knowl. Data Eng., vol. 26, no. 4, pp. 942-956, 2014.
[6]
Su I., Chung Y., and Lee C., Top-kcombinatorial skyline queries, in Database Systems for Advanced Applications, 15th International Conference, 2010, pp. 79-93.
DOI
[7]
Zhu H., Zhu P., Li X., and Liu Q., Top-k skyline groups queries, in Proceedings of the 20th International Conference on Extending Database Technology, 2017, pp. 442-445.
DOI
[8]
Guo X., Li H., Wulamu A., Xie Y., and Fu Y., Efficient processing of skyline group queries over a data stream, Tsinghua Science and Technology, vol. 21, no. 1, pp. 29-39, 2016.
[9]
Zhu H., Zhu P., Li X., and Liu Q., Computing skyline groups: An experimental evaluation, in Proceedings of the ACM Turing 50th Celebration Conference, 2017, pp. 1-48.
DOI
[10]
B¨orzs¨onyi S., Kossmann D., and Stocker K., The skyline operator, in Proceedings of the 17th International Conference on Data Engineering, 2001, pp. 421-430.
[11]
Zhang P., Zhou C., Wang P., Gao B. J., Zhu X., and Guo L., E-tree: An efficient indexing structure for ensemble models on data streams, IEEE Trans. Knowl. Data Eng., vol. 27, no. 2, pp. 461-474, 2015.
[12]
Kossmann D., Ramsak F., and Rost S., Shooting stars in the sky: An online algorithm for skyline queries, in Proceedings of 28th International Conference on Very Large Data Bases, 2002, pp. 275-286.
DOI
[13]
Li X., Wang Y., Li X., Wang X., and Yu J., GDPS: An efficient approach for skyline queries over distributed uncertain data, Big Data Research, vol. 1, no.1, pp. 23-36, 2014.
[14]
Li X., Wang Y., Li X., and Wang Y., Parallel skyline queries over uncertain data streams in cloud computing environments, IJWGS, vol. 10, no. 1, pp. 24-53, 2014.
[15]
Lee K. C. K., Zheng B., Li H., and Lee W., Approaching the skyline in Z order, in Proceedings of the 33rd International Conference on Very Large Data Bases, 2007, pp. 279-290.
[16]
Chomicki J., Godfrey P., Gryz J., and Liang D., Skyline with presorting, in Proceedings of the 19th International Conference on Data Engineering, 2003, pp. 717-719.
[17]
Bartolini I., Ciaccia P., and Patella M., Efficient sort-based skyline evaluation, ACM Trans. Database Syst., vol. 33, no. 4, pp. 1-31, 2008.
[18]
Zhang S., Mamoulis N., and Cheung D. W., Scalable skyline computation using object-based space partitioning, in Proceedings of the International Conference on Management of Data, 2009, pp. 483-494.
DOI
[19]
Lee J. and Hwang S., Scalable skyline computation using a balanced pivot selection technique, Inf. Syst., vol. 39, no. 1, pp. 1-21, 2014.
[20]
Pei J., Jiang B., Lin X., and Yuan Y., Probabilistic skylines on uncertain data, in Proceedings of the 33rd International Conference on Very Large Data Bases, 2007, pp. 15-26.
[21]
Lu H., Jensen C. S., and Zhang Z., Flexible and efficient resolution of skyline query size constraints, IEEE Trans. Knowl. Data Eng., vol. 23, no. 7, pp. 991-1005, 2011.
[22]
Yuan Y., Lin X., Liu Q., Wang W., Yu J., and Zhang Q., Efficient computation of the skyline cube, in Proceedings of the 31st International Conference on Very Large Data Bases, 2005, pp. 241-252.
[23]
Zhang Q., Ye P., Lin X., and Zhang Y., Skyline probability over uncertain preferences, in Joint 2013 EDBT/ICDT Conferences, 2013, pp. 395-405.
DOI
[24]
Zhang X., Li G., and Feng J., Crowd-sourced top-k algorithms: An experimental evaluation, PVLDB, vol. 9, no. 8, pp. 612-623, 2016.
[25]
Chester S., Sidlauskas D., Assent I., and Bøgh K. S., Scalable parallelization of skyline computation for multi-core processors, in 31st IEEE International Conference on Data Engineering, 2015, pp. 1083-1094.
DOI
Publication history
Copyright
Acknowledgements
Rights and permissions

Publication history

Received: 22 May 2017
Revised: 28 December 2017
Accepted: 10 January 2018
Published: 31 December 2018
Issue date: April 2019

Copyright

© The author(s) 2019

Acknowledgements

This work was supported by the National Natural Science Foundation of China (Nos. 61502511, 61501482, and 61502513).

Rights and permissions

Return