Volume 18, Issue 6




Performance Prediction for Performance-Sensitive Queries Based on Algorithmic Complexity

Chihung Chi, Ye Zhou, and Xiaojun Ye
School of Software, Tsinghua University, Beijing 100084, China
Department of Computer Science and Technology, Tsinghua University, Beijing 100084, China

Abstract

Performance prediction for database queries allows service providers to determine what resources are needed to meet performance requirements. Cost-based and rule-based approaches have been proposed to optimize database query execution plans. However, in Virtual Machine (VM)-based database services, applications hosted on shared infrastructure have little or no visibility into resource sharing or interactions with co-located applications, so neither providers nor users have the right combination of visibility, access, and expertise to perform proper tuning and provisioning. This paper presents a performance prediction model that estimates query execution time from query complexity across various data sizes. The execution time of a user query is modeled as a combination of five basic operator complexities: O(1), O(log(n)), O(n), O(nlog(n)), and O(n²). Moreover, tests indicate that not all queries are equally important for performance prediction. Accordingly, this paper illustrates a process for locating performance-sensitive queries on three benchmarks: RUBiS, RUBBoS, and TPC-W. A key observation is that performance-sensitive queries make up only a small proportion (about 20%) of an application's query set. Evaluation of the performance model on the TPC-W benchmark shows that the complexity-based approach achieves an average prediction error rate of less than 10% in a real-life scenario, which demonstrates the effectiveness of this predictive model.
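The model described in the abstract expresses a query's execution time as a combination of five basic operator complexities fitted against observed data sizes. A minimal sketch of such a fit (an illustrative reconstruction using ordinary least squares, not the authors' actual implementation; the function names are hypothetical) might look like:

```python
import numpy as np

def complexity_features(n):
    """Design-matrix columns for the five basic operator complexities:
    O(1), O(log n), O(n), O(n log n), and O(n^2)."""
    n = np.asarray(n, dtype=float)
    return np.column_stack([np.ones_like(n), np.log(n), n, n * np.log(n), n ** 2])

def fit_query_model(sizes, times):
    """Least-squares fit of measured execution times against the complexity basis.
    Returns one coefficient per complexity term."""
    coeffs, *_ = np.linalg.lstsq(complexity_features(sizes), times, rcond=None)
    return coeffs

def predict_time(coeffs, n):
    """Predict execution time for a data size n from the fitted coefficients."""
    return float(complexity_features([n]) @ coeffs)
```

Once fitted on timings collected at several data sizes, the coefficients indicate which complexity term dominates a query, and `predict_time` extrapolates to unseen data sizes.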

Keywords: query performance, data size, query complexity, performance-sensitive query


Publication history

Received: 08 October 2012
Revised: 13 May 2013
Accepted: 08 June 2013
Published: 06 December 2013
Issue date: December 2013

Copyright

© The author(s) 2013

Acknowledgements

We thank Shuo Chen and Dejun Jiang for their generous support.
