Multi-Clock Snapshot Isolation Concurrency Control for NVM Database

Xuyang Liu; Kang Chen; Mengxing Liu; Shiyu Cai; Yongwei Wu; Weimin Zheng

doi:10.26599/TST.2021.9010036

Tsinghua Science and Technology 2022, 27(6): 925-938 https://doi.org/10.26599/TST.2021.9010036

Open Access | Published: 21 June 2022

Department of Computer Science and Technology, Tsinghua University, Beijing 100084, China

Keywords:

Non-Volatile Memory (NVM), snapshot isolation, Multi-Version Concurrency Control (MVCC), vector clock

Cite this article:

Liu X, Chen K, Liu M, et al. Multi-Clock Snapshot Isolation Concurrency Control for NVM Database. Tsinghua Science and Technology, 2022, 27(6): 925-938. https://doi.org/10.26599/TST.2021.9010036


Abstract

Multi-Clock Snapshot Isolation (MCSI) is a concurrency control mechanism that implements snapshot isolation on a single-layer Non-Volatile Memory (NVM) database. It stores a single copy of data by using multi-version storage to ensure durability and runtime access. With multi-clock transaction timestamp assignment, MCSI can efficiently generate snapshots with vector clocks and use per-thread transaction status arrays to identify uncommitted versions in NVM. For evaluation, we compared MCSI with the PostgreSQL-style concurrency control used in the single-layer NVM database N2DB. The maximum transaction throughput of MCSI is 101%–195% higher than that of N2DB for the YCSB workloads, and 25%–49% higher for the TPC-C workloads. Moreover, the transaction latency of MCSI remains relatively stable as the thread count increases. With 18 worker threads, the average transaction latency of MCSI is 65%–84% lower than that of N2DB for the YCSB workloads and 16%–43% lower for the TPC-C workloads.
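The abstract names two ingredients: snapshots built from vector clocks, and per-thread transaction status arrays that mark uncommitted versions. The following is a minimal in-memory sketch of how such a visibility check could work. It is not the paper's implementation: the class and field names (`ToyMultiClockStore`, `Version`, `Txn`), the rule that a snapshot is the vector of each thread's newest committed timestamp, and the one-transaction-at-a-time-per-thread model are all assumptions made for illustration. Persistence to NVM and write-write conflict detection are omitted.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

ACTIVE, COMMITTED, ABORTED = "active", "committed", "aborted"


@dataclass
class Version:
    value: str
    thread: int  # worker thread whose clock stamped the writing transaction
    ts: int      # that thread's local timestamp


@dataclass
class Txn:
    thread: int
    ts: int
    snapshot: List[int]  # vector clock: one entry per worker thread


class ToyMultiClockStore:
    """Hypothetical model: one clock and one status table per worker thread."""

    def __init__(self, num_threads: int) -> None:
        self.clock = [0] * num_threads      # last timestamp handed out per thread
        self.committed = [0] * num_threads  # newest committed timestamp per thread
        self.status: List[Dict[int, str]] = [{} for _ in range(num_threads)]
        self.chains: Dict[str, List[Version]] = {}  # key -> versions, newest last

    def begin(self, thread: int) -> Txn:
        # Only this thread's clock is touched; no shared global counter.
        self.clock[thread] += 1
        ts = self.clock[thread]
        self.status[thread][ts] = ACTIVE
        # The snapshot is a copy of every thread's committed high-water mark.
        return Txn(thread, ts, list(self.committed))

    def write(self, txn: Txn, key: str, value: str) -> None:
        # Assumption: conflict detection between concurrent writers is omitted.
        self.chains.setdefault(key, []).append(Version(value, txn.thread, txn.ts))

    def commit(self, txn: Txn) -> None:
        self.status[txn.thread][txn.ts] = COMMITTED
        self.committed[txn.thread] = txn.ts

    def abort(self, txn: Txn) -> None:
        self.status[txn.thread][txn.ts] = ABORTED

    def visible(self, txn: Txn, v: Version) -> bool:
        if (v.thread, v.ts) == (txn.thread, txn.ts):
            return True  # a transaction always sees its own writes
        # Otherwise the writer must be committed and covered by the snapshot.
        return (self.status[v.thread][v.ts] == COMMITTED
                and v.ts <= txn.snapshot[v.thread])

    def read(self, txn: Txn, key: str) -> Optional[str]:
        for v in reversed(self.chains.get(key, [])):
            if self.visible(txn, v):
                return v.value
        return None
```

In this sketch, a reader that began before a writer committed keeps seeing the older version even after the commit, because the writer's timestamp exceeds the reader's snapshot entry for that thread. Versions written by active or aborted transactions are filtered out by the status lookup.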

About this article

Publication history

Received: 03 February 2021

Revised: 06 March 2021

Accepted: 06 May 2021

Published: 21 June 2022

Issue date: December 2022

Acknowledgements

This work was supported by the National Key Research & Development Program of China (No. 2016YFB1000504) and the National Natural Science Foundation of China (Nos. 61877035, 61433008, 61373145, and 61572280).

Rights and permissions

The articles published in this open access journal are distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/).