Volume 18, Issue 6




Performance Prediction for Performance-Sensitive Queries Based on Algorithmic Complexity

Chihung Chi, Ye Zhou, and Xiaojun Ye
School of Software, Tsinghua University, Beijing 100084, China
Department of Computer Science and Technology, Tsinghua University, Beijing 100084, China

Abstract

Performance prediction for database queries allows service providers to determine what resources are needed to meet performance requirements. Cost-based and rule-based approaches have been proposed to optimize database query execution plans. However, in Virtual Machine (VM)-based database services, applications hosted on shared infrastructure have little or no visibility into resource sharing or interactions with co-located applications, so neither providers nor users have the right combination of visibility, access, and expertise to perform proper tuning and provisioning. This paper presents a performance prediction model that estimates query execution time from query complexity across various data sizes. The execution time of a user query is modeled as a combination of five basic operator complexities: O(1), O(log(n)), O(n), O(nlog(n)), and O(n²). Moreover, tests indicate that not all queries are equally important for performance prediction. Accordingly, this paper illustrates a process for locating performance-sensitive queries on three benchmarks: RUBiS, RUBBoS, and TPC-W. A key observation is that performance-sensitive queries make up only a small proportion (about 20%) of an application's query set. Evaluation of the performance model on the TPC-W benchmark shows that the complexity-based approach achieves an average prediction error rate of less than 10% in a real-life scenario, which demonstrates the effectiveness of this predictive model.
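The model described in the abstract expresses a query's execution time as a combination of five basic operator complexities fitted against observed data sizes. A minimal sketch of such a fit (an illustrative reconstruction using ordinary least squares, not the authors' actual implementation; the function names are hypothetical) might look like:

```python
import numpy as np

def complexity_features(n):
    """Design-matrix columns for the five basic operator complexities:
    O(1), O(log n), O(n), O(n log n), and O(n^2)."""
    n = np.asarray(n, dtype=float)
    return np.column_stack([np.ones_like(n), np.log(n), n, n * np.log(n), n ** 2])

def fit_query_model(sizes, times):
    """Least-squares fit of measured execution times against the complexity basis.
    Returns one coefficient per complexity term."""
    coeffs, *_ = np.linalg.lstsq(complexity_features(sizes), times, rcond=None)
    return coeffs

def predict_time(coeffs, n):
    """Predict execution time for a data size n from the fitted coefficients."""
    return float(complexity_features([n]) @ coeffs)
```

Once fitted on timings collected at several data sizes, the coefficients indicate which complexity term dominates a query, and `predict_time` extrapolates to unseen data sizes.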

Keywords: query performance, data size, query complexity, performance-sensitive query


Publication history

Received: 08 October 2012
Revised: 13 May 2013
Accepted: 08 June 2013
Published: 06 December 2013
Issue date: December 2013

Copyright

© The author(s) 2013

Acknowledgements

We thank Shuo Chen and Dejun Jiang for their generous support.
